Queue
1. INTRODUCTION
Queues are useful in solving various system programming problems. This chapter presents some simple applications of queues in our everyday life as well as in computer science.
2 DEFINITION
Like a stack, a queue is an ordered collection of homogeneous data elements; in contrast with the stack,
here, insertion and deletion operations take place at two extreme ends. A queue is also a linear data structure
like an array, a stack and a linked list, where the ordering of elements is in a linear fashion. The only
difference between a stack and a queue is that in the case of stack insertion and deletion (PUSH and POP)
operations are at one end (TOP) only, but in a queue insertion (called ENQUEUE) and deletion (called
DEQUEUE) operations take place at two ends called the REAR and FRONT of the queue, respectively.
The figure represents a model of a queue structure. A queue is also termed a first-in first-out (FIFO) structure, since the element inserted first is the one deleted first.
3 REPRESENTATION OF QUEUES
There are two ways to represent a queue in memory: Using an array & Using a linked list
The first kind of representation uses a one-dimensional array and is a better choice where a queue of fixed size is required. The other representation uses a doubly linked list and provides a queue whose size can vary during processing.
Two states of the queue, either empty or containing some elements, can be judged by testing the FRONT and REAR pointers.
The following two algorithms describe the insertion and deletion operations on a circular queue.
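Since the insertion and deletion algorithms themselves are not reproduced here, the following is a minimal Python sketch of a circular queue on a fixed-size array; the class and method names are our own, and FRONT = REAR = -1 is used to mark an empty queue.

```python
class CircularQueue:
    """Fixed-size circular queue; FRONT and REAR wrap around the array."""

    def __init__(self, size):
        self.size = size
        self.items = [None] * size
        self.front = -1          # -1 signals an empty queue
        self.rear = -1

    def is_empty(self):
        return self.front == -1

    def is_full(self):
        return (self.rear + 1) % self.size == self.front

    def enqueue(self, item):
        if self.is_full():
            raise OverflowError("queue is full")
        if self.is_empty():
            self.front = 0
        self.rear = (self.rear + 1) % self.size   # REAR wraps around
        self.items[self.rear] = item

    def dequeue(self):
        if self.is_empty():
            raise IndexError("queue is empty")
        item = self.items[self.front]
        if self.front == self.rear:               # removed the last element
            self.front = self.rear = -1
        else:
            self.front = (self.front + 1) % self.size
        return item
```

The wrap-around of both pointers is what lets the queue reuse slots vacated at the front, which a plain linear array cannot do.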
4.2 Deque
Another variation of the queue is known as a deque (usually pronounced 'deck'). Unlike a queue, in a deque both insertion and deletion operations can be made at either end of the structure. Actually, the term deque originates from 'double-ended queue'. Such a structure is shown in Figure.
It is clear from the deque structure that it is a general representation of both stack and queue. In other words,
a deque can be used as a stack as well as a queue. There are various ways of representing a deque on the
computer. One simple way to represent it is by using a doubly linked list. Another popular representation uses a circular array (as used in a circular queue).
The following four operations are possible on a deque which consists of a list of items:
1. Push_DQ(ITEM): insert ITEM at the FRONT end of the deque.
2. Pop_DQ(): remove the FRONT item from the deque.
3. Inject(ITEM): insert ITEM at the REAR end of the deque.
4. Eject(): remove the REAR item from the deque.
These operations are described for a deque based on a circular array of length LENGTH.
Let the array be DQ[1 ... LENGTH].
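The four operations above can be sketched in Python as follows; this is a hedged illustration using 0-based indices (the text's DQ[1 ... LENGTH] is 1-based) and an element count to distinguish an empty deque from a full one.

```python
class Deque:
    """Deque on a circular array of the given length."""

    def __init__(self, length):
        self.length = length
        self.dq = [None] * length
        self.front = 0
        self.rear = 0
        self.count = 0           # distinguishes empty from full

    def push_dq(self, item):     # insert at the FRONT end
        if self.count == self.length:
            raise OverflowError("deque is full")
        if self.count == 0:
            self.front = self.rear = 0
        else:
            self.front = (self.front - 1) % self.length
        self.dq[self.front] = item
        self.count += 1

    def pop_dq(self):            # remove from the FRONT end
        if self.count == 0:
            raise IndexError("deque is empty")
        item = self.dq[self.front]
        self.front = (self.front + 1) % self.length
        self.count -= 1
        return item

    def inject(self, item):      # insert at the REAR end
        if self.count == self.length:
            raise OverflowError("deque is full")
        if self.count == 0:
            self.front = self.rear = 0
        else:
            self.rear = (self.rear + 1) % self.length
        self.dq[self.rear] = item
        self.count += 1

    def eject(self):             # remove from the REAR end
        if self.count == 0:
            raise IndexError("deque is empty")
        item = self.dq[self.rear]
        self.rear = (self.rear - 1) % self.length
        self.count -= 1
        return item
```

Using only push_dq/pop_dq gives a stack; using only inject/pop_dq gives a queue, which is exactly the sense in which a deque generalizes both.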
There are, however, two known variations of deque: Input-restricted deque and Output-restricted deque.
These two variations are actually intermediate between a queue and a deque. Specifically, an input-restricted deque is a deque which allows insertions at one end (say, the REAR end) only, but allows deletions at both ends. Similarly, an output-restricted deque is a deque where deletions take place at one end only (say, the FRONT end), but insertions are allowed at both ends. Figure represents these two variations of deque.
A priority queue stored in a single array can be implemented in either of the following two ways:
(a) Add the elements at the REAR end as in an ordinary queue. For deletion, starting from the FRONT pointer, traverse the array for an element of the highest priority and delete it from the queue. If this is not the front-most element, shift all the elements after the deleted element one position each to fill up the vacant position (see figure). This implementation, however, is very inefficient, as it involves searching the queue for the highest priority element and shifting the trailing elements after the deletion. A better implementation is as follows:
(b) Add the elements at the REAR end as earlier. Using a stable sorting algorithm, sort the elements of the queue so that the highest priority element is at the FRONT end. When a deletion is required, delete the element from the FRONT end only (see figure).
The second implementation is comparatively better than the first; the only overhead here is keeping the elements sorted.
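Implementation (b) can be sketched as follows. This is a minimal Python illustration (the names are our own); rather than re-sorting the whole queue after each insertion, it places each new element at its sorted position, which has the same effect as a stable sort: equal-priority elements keep their FIFO order.

```python
class SortedPriorityQueue:
    """Implementation (b): the array is kept ordered so the highest-
    priority element is always at the FRONT; deletion touches FRONT only."""

    def __init__(self):
        self.items = []              # highest priority kept at index 0

    def enqueue(self, item, priority):
        # walk past every element with priority >= ours, so that equal
        # priorities keep their arrival (FIFO) order -- a stable insert
        pos = 0
        while pos < len(self.items) and self.items[pos][1] >= priority:
            pos += 1
        self.items.insert(pos, (item, priority))

    def dequeue(self):
        if not self.items:
            raise IndexError("priority queue is empty")
        return self.items.pop(0)[0]  # FRONT = index 0
```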
Multi-queue implementation
This implementation assumes N different priority values. For each priority Pi there are two pointers Fi and
Ri corresponding to the FRONT and REAR pointers respectively. The elements between Fi and Ri are all of
equal priority value Pi. Figure represents a view of such a structure.
With this representation, an element with priority value Pi will consult Fi for deletion and Ri for insertion.
But this implementation is associated with a number of difficulties:
(i) It may lead to a huge shifting in order to make room for an item to be inserted.
(ii) A large number of pointers are involved when the range of priority values is large.
In addition to the above, there are two other techniques to represent a multi-queue, which are shown in
Figures (a) and (b). It is clear from Figure (a) that for each priority value a simple queue is to be maintained.
An element will be added into a particular queue depending on its priority value. The priority queue as
shown in Figure (b) is in some way better than the multi-queue with multiple queues. Here one can get rid of
maintaining several pointers for FRONT and REAR in several queues. A multi-queue with multiple queues
has one advantage: the different queues can be of different, arbitrary lengths. In some applications, the number of elements with some priority value is much larger than that with other values, thus demanding a queue of larger size.
With this structure, to delete an item having priority p, the list will be searched starting from the node under
pointer REAR and the first occurring node with PRIORITY = P will be deleted. Similarly, to insert a node
containing an item with priority p, the search will begin from the node under the pointer FRONT and the
node will be inserted before a node found first with priority value p, or if not found then before the node
with the next priority value. The following two algorithms, Insert_PQ and Delete_PQ, are used to implement the insertion and deletion operations on a priority queue.
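In the spirit of Insert_PQ and Delete_PQ (which are not reproduced here), the following is a simplified Python sketch, not a verbatim reproduction: a singly linked list is kept ordered by priority, so deletion simply removes the FRONT node, and equal-priority items are inserted after existing ones to preserve FIFO order.

```python
class Node:
    def __init__(self, item, priority):
        self.item = item
        self.priority = priority
        self.next = None

class LinkedPriorityQueue:
    """Singly linked list kept ordered by priority (highest at FRONT)."""

    def __init__(self):
        self.front = None

    def insert_pq(self, item, priority):
        node = Node(item, priority)
        # a strictly higher priority than the FRONT node goes first
        if self.front is None or self.front.priority < priority:
            node.next = self.front
            self.front = node
            return
        # otherwise walk until the next node has a lower priority;
        # '>=' keeps equal-priority items in arrival (FIFO) order
        cur = self.front
        while cur.next is not None and cur.next.priority >= priority:
            cur = cur.next
        node.next = cur.next
        cur.next = node

    def delete_pq(self):
        if self.front is None:
            raise IndexError("priority queue is empty")
        node = self.front
        self.front = node.next
        return node.item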
5. APPLICATIONS OF QUEUES
Numerous applications of queue structures are known in computer science. One major application of queues
is in simulation. Another important application of queues is observed in the implementation of various
aspects of an operating system. A multiprogramming environment uses several queues to control various
programs. Various scheduling algorithms are known to use varieties of queue structures.
5.2 CPU Scheduling in a Multiprogramming Environment
In a multiprogramming environment, a single CPU has to serve more than one program simultaneously. This
section gives a brief idea about how queues are important to manage various programs in such an
environment. Let us consider a multiprogramming environment where the possible jobs for the CPU are
categorized into three groups:
1.Interrupts to be serviced. A variety of devices and terminals are connected to the CPU and they may
interrupt the CPU at any moment to get a particular service from it.
2.Interactive users to be serviced. These are mainly users' programs under execution at various terminals.
3.Batch jobs to be serviced.
Here the problem is to schedule all sorts of jobs so that the required level of performance of the environment
will be attained. One way to implement complex scheduling is to classify the workload according to its
characteristics and to maintain separate process queues. So far as the environment is concerned, we can
maintain three queues, as depicted in Figure. This approach is often called multi-level queues scheduling.
Processes will be assigned to their respective queues. The CPU will then service the processes as per the
priority of the queues. In the case of a simple strategy, absolute priority, the processes from the highest-priority queue (for example, system processes) are serviced until that queue becomes empty. Then the CPU switches
to the queue of interactive processes which has medium priority, and so on. A lower priority process may, of
course, be pre-empted by a higher-priority arrival in one of the upper level queues.
The multi-level queues strategy is a general discipline but has some drawbacks. The main drawback is that when the arrival rate in the higher-priority queues is very high, the processes in a lower-priority queue may starve for a long time. One way to solve this problem is to time-slice between the queues. Each queue gets a
certain portion of the CPU time. Another possibility is known as multi-level feedback queue strategy.
Normally in multi-level queue strategy, as we have seen, processes are permanently assigned to a queue
upon entry to the system and processes do not move between queues. The multi-level feedback queue
strategy, on the contrary, allows a process to move between queues. The idea is to separate out the
processes with different CPU burst characteristics. If a process uses too much of CPU time (that is, long run
process), it will be 'moved to a lower-priority queue. Similarly, a process which is waiting for too long a
time in a lower-priority queue, may be moved to a higher-priority queue. For example, consider a multi-level
feedback queue strategy with three queues Q1, Q2 and Q3 (Figure 5.21).
A process entering the system is put in queue Q1. A process in Q1 is given a time quantum of, say, 10 ms. If it does not finish within this time, it is moved to the tail of queue Q2. If Q1 is empty, the process at the front of queue Q2 is given a time quantum of, say, 20 ms. If it does not complete within this time quantum, it is pre-empted and put into queue Q3. Processes in queue Q3 are serviced only when queues Q1 and Q2 are empty. Thus, with this strategy, the CPU first executes all processes in queue Q1. Only when Q1 is empty will it execute the processes in queue Q2. Similarly, processes in queue Q3 will be executed only when queues Q1 and Q2 are empty. A process arriving in queue Q1 will pre-empt a process in queue Q2 or Q3.
It can be observed that this strategy gives the highest priority to any process with a CPU burst of 10 ms or
less. Processes which need more than 10 ms but less than or equal to 20 ms are also served quickly; that is, they get the next highest priority, after the shorter processes. Longer processes automatically sink to queue
Q3; from Q3, processes will be served on a first-come first-served (FCFS) basis, and a process waiting for too long a time (as decided by the scheduler) may be put into the tail of queue Q1.
5.3 Round Robin Algorithm
The round robin (RR) algorithm is a well-known scheduling algorithm and is designed especially for time
sharing systems. Here, we will see how a circular queue can be used to implement such an algorithm. Before
going to implement the RR algorithm, we should first describe the algorithm with illustration. Suppose,
there are n processes P1, P2, ..., Pn required to be served by the CPU. Different processes require different execution times. Suppose the processes arrive in the order of their subscripts, that is, P1 comes before P2 and, in general, Pi comes after Pi-1 for 1 < i ≤ n. The RR algorithm first decides a small unit of time, called a time quantum or time slice, λ. A time quantum is generally from 10 to 100 milliseconds. The CPU starts service with P1. P1 gets the CPU for time λ; afterwards the CPU switches to P2, and so on. When the CPU reaches the end of the time quantum of Pn, it returns to P1 and the same procedure is repeated. Now, during time sharing, if a process finishes its execution before the end of its time quantum, the process simply releases the CPU and the next process in waiting gets the CPU immediately. The total CPU time required is 30 units. Let us assume a time quantum of 4 units. The RR scheduling for this will
be as shown in Figure.
See the result by repeating the calculations but using the sequence of processes as P2, P1 and P3. In time
sharing systems any process may arrive at any instant of time. Generally, all the processes currently under
execution are maintained in a queue. When a process finishes its execution it is deleted from the queue and
whenever a new process arrives it is inserted at the tail of the queue and waits for its turn. To illustrate this,
let us consider Table 5.3.
The total CPU time required is 25 units. Let the time quantum be 5 units. Figure 5.23 illustrates the
snapshot at various instants with RR scheduling. Now let us discuss the implementation of the RR
scheduling algorithm. A circular queue is the best choice for it. It may be noted that it is not strictly a circular queue, because here a process, upon completion, is deleted from the queue, and not necessarily from the front of the queue; it can be removed from any position. Except for this, RR scheduling follows all the properties of a queue, that is, the process which comes first gets its turn first. The implementation of the RR algorithm using a circular queue is straightforward. Here, we use a variable-sized circular queue; the size of the queue at any instant is decided by the number of processes in execution at that instant. One additional mechanism is necessary: whenever a process is deleted, the remaining processes are squeezed together, starting from the front pointer, to fill the space of the deleted process (Figure 5.24).
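The RR scheme can be sketched as a short Python simulation. The process names and burst times below are illustrative (three processes whose bursts total 30 units, with a quantum of 4 units, as in the example above), and Python's deque stands in for the circular queue: a process that exhausts its quantum rejoins at the tail, while a finished process simply leaves the queue.

```python
from collections import deque

def round_robin(processes, quantum):
    """Simulate RR scheduling.

    processes: list of (name, burst_time) pairs in arrival order.
    Returns the names in the order the processes finish.
    """
    ready = deque(processes)         # acts as the circular queue
    finished = []
    while ready:
        name, remaining = ready.popleft()
        if remaining > quantum:
            # quantum used up: rejoin at the tail with less work left
            ready.append((name, remaining - quantum))
        else:
            finished.append(name)    # done: leave the queue
    return finished

# P1, P2 and P3 need 10, 4 and 16 time units; the quantum is 4 units
print(round_robin([("P1", 10), ("P2", 4), ("P3", 16)], 4))
```

Running it prints `['P2', 'P1', 'P3']`: P2 finishes within its first quantum, while P1 and P3 keep cycling through the queue until their remaining time fits in a quantum.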
6. HASH TABLE
There are other types of tables which help us to retrieve information very efficiently. The ideal hash table is
merely an array of some constant size; the size depends on the application where it will be used. The hash
table contains key values with pointers to the corresponding records. The basic idea of a hash table is that we
have to place a key value into a location in the hash table; the location will be calculated from the key value
itself. This correspondence between a key value and an index in the hash table is known as
address calculation indexing or more commonly hashing. In the present section, we will discuss hashing
techniques and their related issues.
It may be noted that the mapping is, in general, many-to-one: every key value is mapped to some index, and more than one key value may be mapped to the same index value. The function that governs this mapping is called the hash function. A particular hashing technique uses a particular hash function. The hash function plays a dominant role in hashing techniques. There are two principal criteria in deciding a hash function H: K → I, as follows:
1.The function H should be very easy and quick to compute.
2.The function H should as far as possible give two different indices for two different key values.
As an example, let us consider a hash table of size 10 whose indices are 0, 1, 2, ... , 8, 9. Suppose a set of key
values are: 10, 19, 35, 43, 62, 59, 31, 49, 77, 33. Let us assume the hash function H is as stated below:
- Add the two digits in the key.
- Take the digit at the unit place of the result as the index; ignore the digit at the ten's place, if any.
Using this hash function, the mappings from key values to indices and to hash table are shown in Figure. In
this example, for the given set of key values, the hash function does not distribute them uniformly over the
hash table; some entries are there which are empty, and in some entries more than one key value needs to be
stored. Allotment of more than one key value in one location in the hash table is called collision. We have
found three collisions for 62, 31 and 77 in the above-mentioned example. It can be noted that |K| = |I|, that
is, the number of key values is the same as the size of the hash table, but this is not the case always. In
general, |K| > | I |. The following are some hash functions which are very common and popularly applied in
various applications.
Division method
One of the fast hashing functions, and perhaps the most widely accepted, is the division method, which is
defined as follows:
Choose a number h larger than the number N of keys in K. The hash function H is then defined by
H(k) = k MOD h (if indices start from 0)
H(k) = (k MOD h) + 1 (if indices start from 1)
where k ∈ K is a key value. The operator MOD denotes modulo arithmetic: k MOD h is the remainder on dividing k by h. For example, if k = 31 and h = 13, then
H(31) = 31 MOD 13 = 5
or
H(31) = (31 MOD 13) + 1 = 6
The number h is usually chosen to be a prime number or a number without small divisors, since this usually
minimizes the number of collisions. Generally, h is a prime number and equal to the size of the hash table.
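The division method is easy to state in code; a minimal Python sketch, checked against the example values above:

```python
def hash_division(k, h, one_based=False):
    """Division-method hash: H(k) = k MOD h, or (k MOD h) + 1 when the
    table indices start from 1."""
    return k % h + (1 if one_based else 0)

print(hash_division(31, 13))            # 31 MOD 13 = 5
print(hash_division(31, 13, True))      # (31 MOD 13) + 1 = 6
```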
Midsquare method
Another hash function which has been widely used in many applications is the midsquare method. The
method is defined as follows:
The hash function H is defined by H(k) = x, where x is obtained by selecting an appropriate number of bits
or digits from the middle of the square of the key value k. This selection usually depends on the size of the
hash table. It needs to be emphasized that the same criteria should be used for selecting the bits or digits for
all of the keys. As an example, suppose the key values are of the integer type, and we require 3-digit
addresses. Our selection criterion is to select the 3 digits at even positions, starting from the right-most digit of the square. Let us see the address calculations for 3 distinct keys with the hash function as defined above:
Here, we observe that the second, the fourth, and the sixth digits, counting from the right, are chosen for the
hash addresses. The midsquare method has been criticized because of time-consuming computation
(multiplication operation), but it usually gives good results so far as the uniform distribution of the keys over
the hash table is concerned.
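The midsquare selection rule above (keep the 2nd, 4th and 6th digits of k², counting from the right) can be sketched as follows. Note that the order in which the selected digits are assembled into the address is our own assumption; only the selection rule itself comes from the text.

```python
def hash_midsquare(k):
    """Midsquare sketch: square the key, then keep the digits at the
    2nd, 4th and 6th positions counting from the right as a 3-digit
    address (most significant selected digit first -- an assumption)."""
    s = str(k * k).rjust(6, "0")     # pad so positions 2, 4, 6 exist
    picked = s[-6] + s[-4] + s[-2]   # digits at positions 6, 4, 2
    return int(picked)
```

For example, 123² = 15129, padded to 015129; positions 6, 4, 2 from the right hold 0, 5 and 2, giving the address 52.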
Folding method
Another fair method for a hash function is the folding method. The method can be defined as follows:
Partition the key k into a number of parts k1, k2, ..., kn, where each part, except possibly the last, has the same number of bits or digits as the required address width. Then the parts are added together, ignoring the last carry, if any. That is, H(k) = k1 + k2 + ... + kn
where the last carry, if any, is ignored. If the keys are in binary form, the exclusive-OR operation may be
substituted for addition. There are many variations known in this method. One is called the fold shifting
method, where the even number parts, k2, k4, ... are each reversed before the addition. Another variation is
called the fold boundary method. Here, two boundary parts, namely, k1 and kn, each are reversed and then
added to all other parts. As an example, let us take the size of each part to be 2; the following calculations
are performed on the given key values (integers) as shown below.
Folding is a hashing function which is also useful in converting multi-word keys into a single word so that
another hashing function can be used on that. In fact, the term 'hashing' comes from this technique of
'chopping' a key into pieces.
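The pure folding rule can be sketched as follows; this is a minimal Python illustration with digit parts taken from the left (the fold shifting and fold boundary variants, which reverse some parts before adding, are omitted).

```python
def hash_fold(k, part_size=2):
    """Pure fold: chop the key's digits into parts of part_size digits,
    add the parts, and drop any carry beyond part_size digits."""
    digits = str(k)
    parts = [int(digits[i:i + part_size])
             for i in range(0, len(digits), part_size)]
    total = sum(parts)
    return total % (10 ** part_size)   # ignore the last carry

print(hash_fold(5678))    # 56 + 78 = 134, carry dropped -> 34
```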
Digit analysis method
In this method, selected digits of the key are extracted and possibly rearranged to form the hash address; the decision for extraction and then rearrangement is based on some analysis. An analysis is performed to determine which key positions should be used in forming hash addresses. For each criterion, hash addresses are calculated and a graph is plotted; that criterion is then selected which produces the most uniform distribution, that is, the one with the smallest peaks and valleys. This method is particularly useful in
the case of static files where the key values of all the records are known in advance. We have assumed the
key values as integers in our previous discussions, but it need not be so always. In fact, any key value can be
represented by a string of characters and then ASCII values of its constituent characters can be taken to
convert it into a numeric value. Thus, assuming that a key value k = k1k2k3 ... km where each ki is the
constituent character in k. The hash function using the division method is stated as below in algorithm
HashDivision.
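Since algorithm HashDivision itself is not reproduced here, the following hedged Python sketch assumes the simplest character-to-number conversion, summing the ASCII codes before applying the division method; the book's algorithm may well combine the characters differently (for example, with positional weights).

```python
def hash_string_division(key, h):
    """HashDivision-style sketch: sum the ASCII codes of the characters
    in the key, then apply the division method (0-based indices)."""
    numeric = sum(ord(ch) for ch in key)   # assumed conversion rule
    return numeric % h

print(hash_string_division("AB", 10))   # (65 + 66) MOD 10 = 1
```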
To see why collisions are practically unavoidable, consider the well-known birthday problem. Open the calendar of the year of the students' birth, and assume that there are 365 days. Start with any student, and put a tick on his birthday date on the calendar. Now, the probability that the second student has a different birthday is 364/365. Tick this date off. The probability that a third student has a different birthday is now 363/365. Continuing this way, we see that if the first (n - 1) students have different birthdays, then the probability that the nth student has a different birthday is (365 - n + 1)/365.
Since the birthdays of different people are independent, the probability that n students all have different birthdays is
(364/365) × (363/365) × ... × ((365 - n + 1)/365)
This probability can be calculated to be less than 0.5 whenever n ≥ 23.
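This product is easy to check numerically with a small Python sketch:

```python
def prob_all_distinct(n, days=365):
    """Probability that n people all have different birthdays:
    the product (days/days) * ((days-1)/days) * ... * ((days-n+1)/days)."""
    p = 1.0
    for i in range(n):
        p *= (days - i) / days
    return p
```

Evaluating it shows the probability of all-distinct birthdays first drops below 0.5 between n = 22 and n = 23.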
In other words, suppose there is a hash table of size 365 and we want to store the records of all 24 students using their birthdays as key values. There is then roughly a fifty-fifty chance that two of the students share a birthday, and hence a collision. So collisions in hashing cannot be ignored, whatever the size of the hash table. The question that arises, therefore, is what to do when there is a collision. There are several techniques to resolve collisions. Two important methods are listed below:
(a) Closed hashing (also called linear probing)
(b) Open hashing (also called chaining).
Start with the hash address where the collision has occurred; let it be i. Then follow this sequence of locations in the hash table and do a sequential search: i, i + 1, i + 2, ..., h, 1, 2, ..., i - 1. The search will continue until any one of the following cases occurs:
- The key value is found.
- An unoccupied (or empty) location is encountered.
- The search reaches the location where it had started.
The first case corresponds to the successful search and the last two cases correspond to unsuccessful search.
Here the hash table is considered circular, so that when the last location is reached, the search proceeds to
the first location of the table. This is why the technique is termed closed hashing. Since the technique
searches in a straight line, it is also alternatively termed linear probing; probe means key comparison. Let us
illustrate the method with an example. Assume that there is a hash table of size 10 and the hash function uses
the division method with remainder modulo 7, namely, H(k) = (k MOD 7) + 1. Let us consider the build-up of the hash table (initially, the table is empty) with the following set of key values: 15, 11, 25, 16, 9, 8, 12, 8
The hash table is loaded by successively searching for each key and inserting it into an empty location if the key is not already in the table, stopping on overflow, that is, when there is no free room to accommodate any further key value. This is illustrated in Figure.
Next, let us define the operation for searching a key-value and inserting a key-value. The algorithm
HashLinearProbe for searching a key value K in a hash table of size HSIZE is given below:
Note Step 9 in the above algorithm. Here, we assume that whenever a key value is deleted from the hash
table its corresponding entries are made negative instead of NULL. Writing an algorithm for deleting a key
value is straightforward and is left as an exercise.
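Since algorithm HashLinearProbe is not reproduced here, the following hedged Python sketch captures the search-and-insert loop with the example's hash function, using 0-based indices (slot k MOD 7 here corresponds to the text's (k MOD 7) + 1); None marks an empty slot, and the deleted-entry marker of Step 9 is omitted for brevity.

```python
def probe_insert(table, key, h):
    """Closed hashing (linear probing) sketch: probe from the home
    address, wrapping around, until the key or an empty slot is found."""
    hsize = len(table)
    start = key % h              # 0-based home address
    i = start
    while True:
        if table[i] is None or table[i] == key:
            table[i] = key       # empty slot (or key already present)
            return i
        i = (i + 1) % hsize      # the table is treated as circular
        if i == start:
            raise OverflowError("hash table is full")

# build the table of size 10 from the example's keys (h = 7);
# the repeated key 8 would simply find the existing entry
table = [None] * 10
for k in (15, 11, 25, 16, 9, 8, 12):
    probe_insert(table, k, 7)
```

After the loop, 25 (which collides with 11 at slot 4) sits in slot 5, and 8 (whose home slot 1 is taken by 15) has been probed forward to slot 6.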
Random probing: Instead of probing locations in linear order, random probing generates a pseudo-random sequence of locations spanning the lowest to the highest location of the hash table. An example of a pseudo-random number generator that produces such a random sequence of locations is given below:
i = (i + m) MOD h + 1
where i is a number in the sequence, and m and h are integers that are relatively prime to each other (that is, their greatest common divisor is 1). For example, suppose m = 5, h = 11 and initially i = 2; then the above-mentioned pseudo-random number generator generates the sequence: 8, 3, 9, 4, 10, 5, 11, 6, 1, 7, 2
We stop producing the numbers when the first location is duplicated. Observe that here all the numbers
between 1 and 11 are generated but randomly. We can avoid primary clustering if the probe follows the said
random sequence.
Double hashing: Random hashing, however, is not free from clustering. Another type of clustering, called
secondary clustering, is involved here. In particular, clustering occurs when two keys are hashed into the
same location. In such an instance, if the same sequence of locations is generated for two different keys by
the random probing method then clustering takes place. An alternative approach to avoid the secondary
clustering problem is to use a second hash function in addition to the first one. This second hash function
results in the value of m for the pseudo random number generator as employed in the random probing
method. This second function should be selected in such a way that the hash addresses generated by the two
hash functions are distinct and the second function generates a value m for the key k so that m and h are
relatively prime. Let us consider the following example.
Suppose H1(k) is the initially used hash function and H2(k) is the second one. These two functions are
defined as
H1(k) = (k MOD h) + 1
H2(k) = (k MOD (h - 4)) + 1
Let h = 11 and k = 50 for an instance. Then, H1(50) = 7 and H2(50) = 2. Therefore, H1(50)!=H2(50), that is,
H1 and H2 are independent and m = 2, h = 11 are relatively prime. Hence, using i = [(i + 2) MOD 11] + 1,
and initially i = 7, we have the random sequence as 10, 2, 5, 8, 11, 3, 6, 9, 1, 4, 7
Now, let us choose another key value which has the same hash address as that of 50 (that is, 7) with the first hash function H1. Let it be 39 (since H1(39) = 39 MOD 11 + 1 = 7). Then H2(39) = 39 MOD 7 + 1 = 5.
So, using i = [(i + m) MOD 11] + 1 with i = 7 and m = 5, we get the sequence: 2, 8, 3, 9, 4, 10, 5, 11, 6, 1, 7
Thus, for the two key values where the hash address is the same and using rehashing, two different random
sequences are generated, thereby alleviating the secondary clustering.
Quadratic probing: Quadratic probing is a collision resolution method that eliminates the primary clustering problem of linear probing. In linear probing, if there is a collision at location i, then the next locations i + 1, i + 2, i + 3, etc. are probed; in quadratic probing, the next locations to be probed are i + 1², i + 2², i + 3², etc. Mathematically, if h is the size of the hash table and H(k) is the hash function, then quadratic probing searches the locations:
H(k), (H(k) + 1²) MOD h, (H(k) + 2²) MOD h, (H(k) + 3²) MOD h, ...