Unit-3: Components of Greedy Algorithm
Among all the algorithmic approaches, the simplest and most straightforward is the
Greedy method. In this approach, decisions are taken on the basis of currently available
information, without worrying about the effect of the current decision in the future.
Greedy algorithms build a solution part by part, choosing the next part in such a way
that it gives an immediate benefit. This approach never reconsiders the choices taken
previously. It is mainly used to solve optimization problems. The Greedy
method is easy to implement and quite efficient in most cases. Hence, we can
say that a Greedy algorithm is an algorithmic paradigm based on a heuristic that makes the
locally optimal choice at each step with the hope of finding a globally optimal solution.
In many problems it does not produce an optimal solution, though it gives an
approximate (near-optimal) solution in a reasonable time.
A greedy algorithm is typically built from the following components (illustrated in the sketch after the list):
A selection function − Used to choose the best candidate to be added to the solution.
A feasibility function − Used to determine whether a candidate can be used to contribute to the solution.
A solution function − Used to indicate whether a complete solution has been reached.
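To make these components concrete, here is a minimal sketch in Python, using coin change as a stand-in problem. The names greedy_coin_change and denominations are illustrative, not from any particular library.

# A minimal greedy skeleton: pick the best remaining candidate (selection),
# keep it only while it still fits (feasibility), stop once done (solution).
def greedy_coin_change(amount, denominations):
    denominations = sorted(denominations, reverse=True)  # candidates, best first
    solution = []
    for coin in denominations:                # selection: best candidate next
        while coin <= amount:                 # feasibility: can it contribute?
            solution.append(coin)
            amount -= coin
        if amount == 0:                       # solution check: are we done?
            break
    return solution if amount == 0 else None  # None: greedy failed to finish

print(greedy_coin_change(67, [25, 10, 5, 1]))  # [25, 25, 10, 5, 1, 1]

For the denominations above the greedy choice happens to be optimal; for denominations like {4, 3, 1} and amount 6 it returns [4, 1, 1] instead of the optimal [3, 3], which illustrates why greedy gives only a near-optimal answer in general.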
Areas of Application
Finding the minimum spanning tree in a graph using Prim's/Kruskal's algorithm, etc.
In many problems, the Greedy algorithm fails to find an optimal solution; moreover, it may
even produce the worst possible solution. Problems like Travelling Salesman and 0/1 Knapsack
cannot be solved optimally using this approach.
All data structures are combined, and the concept is used to form a specific algorithm. All algorithms are designed with a motive to achieve
the best solution for any particular problem. In the greedy technique, choices are made from the given solution domain; being
greedy, the choice that seems to supply the optimum solution is chosen next.
The greedy method is used to find a restricted most favorable result which may finally land in globally optimized answers, but usually greedy
algorithms do not provide globally optimized solutions.
A game like chess can be won only by thinking ahead: a player who is focused entirely on immediate benefit is easy to defeat. But in some
other games, such as Scrabble, it is possible to do quite well by simply making whichever move seems best at the moment and not worrying too
much about future consequences. Greedy algorithms build up a solution piece by piece, always choosing the next piece which offers the most
obvious and immediate benefit. Although this kind of approach can be disastrous for some computational jobs, there are many for which it is
well suited. The first example you are going to see is the minimum spanning tree.
Suppose you are asked to create a networked collection of computers by connecting selected pairs of those computers. This translates
into a graph problem wherein the nodes are computers, undirected edges are possible links, and the objective is to pick enough of these
edges so that all the nodes are connected. And that is not all: each link has a maintenance cost, reflected in that edge's weight.
So what you can do is construct a graph with six vertices/nodes named A, B, C, D, E, and F and assign each edge a weight.
Furthermore, there are lots of related problems that use the concept of the greedy algorithm for finding the most favorable solution.
This approach does not guarantee a globally optimal solution, since it never looks back at the choices made while finding the locally
optimal solution.
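As a sketch of how the greedy choice plays out on such a network, here is Kruskal's algorithm in Python. The vertices A-F come from the example above, but the edge list and weights below are invented purely for illustration.

# Kruskal's greedy MST algorithm: repeatedly take the cheapest remaining edge
# (selection) that does not close a cycle (feasibility), until all nodes join.
def kruskal(vertices, edges):
    parent = {v: v for v in vertices}

    def find(v):                       # union-find with path halving
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    mst, total = [], 0
    for w, u, v in sorted(edges):      # greedy choice: cheapest edge first
        ru, rv = find(u), find(v)
        if ru != rv:                   # feasible only if it creates no cycle
            parent[ru] = rv
            mst.append((u, v, w))
            total += w
    return mst, total

edges = [(1, 'A', 'B'), (4, 'B', 'C'), (3, 'A', 'D'), (2, 'B', 'D'),
         (5, 'C', 'F'), (6, 'D', 'E'), (2, 'E', 'F')]
print(kruskal('ABCDEF', edges))        # 5 edges, total maintenance cost 14

Sorting the edges once and accepting each cheapest edge that does not close a cycle is exactly the greedy pattern: a selection function (cheapest remaining edge) plus a feasibility check (no cycle). For minimum spanning trees, unlike most problems, this local rule is provably globally optimal.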
Amortized analysis is a worst-case analysis of a sequence of operations, used to obtain a tighter bound on the overall or average cost per operation in the
sequence than is obtained by analyzing each operation separately. For instance, when we considered the union and find operations for the
disjoint-set data abstraction earlier in the semester, we were able to bound the running time of individual operations by O(log n). However, for a sequence
of n operations, it is possible to obtain a tighter bound than O(n log n) (although that analysis is more appropriate to 4820 than to this course). Here we will
consider a simplified version of the hash table problem, and show that a sequence of n insert operations has overall time O(n).
Three methods are commonly used:
- The aggregate method, where the total running time for a sequence of operations is analyzed.
- The accounting (or banker's) method, where we impose an extra charge on inexpensive operations and use it to pay for expensive operations later.
- The potential (or physicist's) method, in which we derive a potential function characterizing the amount of extra work we can do in each
step. This potential either increases or decreases with each successive operation, but cannot be negative.
Consider an extensible array that can store an arbitrary number of integers, like an ArrayList or Vector in Java. These are implemented
in terms of ordinary (non-extensible) arrays. Each add operation inserts a new element after all the elements previously inserted. If there are no empty cells
left, a new array of double the size is allocated, and all the data from the old array is copied to the corresponding entries in the new array. For instance,
+--+
Insert 11 |11|
+--+
+--+--+
Insert 12 |11|12|
+--+--+
+--+--+--+--+
Insert 13 |11|12|13| |
+--+--+--+--+
+--+--+--+--+
Insert 14 |11|12|13|14|
+--+--+--+--+
+--+--+--+--+--+--+--+--+
Insert 15 |11|12|13|14|15| | | |
+--+--+--+--+--+--+--+--+
The table is doubled in the second, third, and fifth steps. As each insertion takes O(n) time in the worst case, a simple analysis would yield a bound of O(n²)
time for n insertions. But it is not this bad. Let's analyze a sequence of n operations using the three methods.
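As a quick sanity check of the aggregate claim, the sketch below (illustrative Python, not part of the original analysis) simulates n insertions under the doubling strategy and counts how many element copies occur in total.

# Aggregate method, simulated: count the total number of element copies
# across n insertions and check that it stays below 2n, i.e. O(1) amortized.
def total_copy_cost(n):
    capacity, size, copies = 1, 0, 0
    for _ in range(n):
        if size == capacity:      # table full: double it and copy everything
            copies += size
            capacity *= 2
        size += 1                 # the insertion itself costs O(1), not counted
    return copies

for n in (10, 100, 1000):
    print(n, total_copy_cost(n))  # 15, 127, 1023: always below 2n

The copies form the geometric series 1 + 2 + 4 + ... + 2^k < 2n, so the n insertions cost O(n) in total, which is the aggregate-method argument.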
Bin Packing
The Bin Packing Problem belongs to the class of NP-hard problems. It is basically about packing bins with items of different sizes, with objectives like:
- packing in the most time-efficient way,
- packing the items so that they are distributed evenly,
- packing the items into as few bins as possible.
There are different ways to achieve this, each with its own advantages. These are the algorithms that exist:
- Next Fit / Next Fit Decreasing
- First Fit / First Fit Decreasing
- Worst Fit / Worst Fit Decreasing
- Full Fit / Best Fit
Bin Packing Problem (Minimize number of used Bins)
Given n items of different weights and bins each of capacity c, assign each item to a bin such that the total number of bins used is minimized. It
may be assumed that all items have weights smaller than the bin capacity.
Example:
Input: weight[] = {4, 8, 1, 4, 2, 1}
Bin Capacity c = 10
Output: 2
We need a minimum of 2 bins to accommodate all items.
First bin contains {4, 4, 2} and second bin contains {8, 1, 1}.
Lower Bound
We can always find a lower bound on the minimum number of bins required. The lower bound can be given as:
Min number of bins ≥ ceil(sum of all item weights / bin capacity)
In the above example, the lower bound is ceil((4 + 8 + 1 + 4 + 2 + 1)/10) = ceil(20/10) = 2.
This problem is an NP-hard problem and finding an exact minimum number of bins takes exponential time. Following are approximate
algorithms for this problem.
Online Algorithms
These algorithms are for bin packing problems where items arrive one at a time (in
unknown order), and each must be put in a bin before considering the next item.
1. Next Fit:
When processing the next item, check whether it fits in the same bin as the last item; use a new
bin only if it does not.
For the example above (weights {4, 8, 1, 4, 2, 1}, c = 10), Next Fit produces the bins {4}, {8, 1}, and {4, 2, 1}, so the number of bins required by Next Fit is 3.
Next Fit is a simple algorithm. It requires only O(n) time and O(1) extra space to process n items.
Next Fit is 2-approximate, i.e., the number of bins used by this algorithm is bounded by twice the optimal. Consider any two adjacent bins: the
sum of the items in these two bins must be > c, otherwise Next Fit would have put all the items of the second bin into the first. The same holds for all
other pairs of bins. Thus, at most half the space is wasted, and so Next Fit uses at most 2M bins if M is optimal.
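Here is a sketch of Next Fit in Python for the example weights used earlier; next_fit and lower_bound are hypothetical helper names. The lower_bound function implements the ceiling bound from the Lower Bound section above.

import math

def lower_bound(weights, c):
    # The optimum can never beat total weight divided by capacity, rounded up.
    return math.ceil(sum(weights) / c)

def next_fit(weights, c):
    bins, remaining = 0, 0          # remaining space in the current bin
    for w in weights:
        if w > remaining:           # item doesn't fit: open a new bin
            bins += 1
            remaining = c
        remaining -= w
    return bins

weights = [4, 8, 1, 4, 2, 1]
print(lower_bound(weights, 10))     # 2
print(next_fit(weights, 10))        # 3 bins: {4}, {8, 1}, {4, 2, 1}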
2. First Fit:
When processing the next item, scan the previous bins in order and place the item in the first bin that fits. Start a new bin only if the item does not
fit in any of the existing bins.
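A corresponding sketch of First Fit, again with illustrative names. The plain list scan below takes O(n²) time; a balanced tree over remaining bin capacities would bring this down to O(n log n), but that is beyond this sketch.

def first_fit(weights, c):
    bins = []                        # remaining space of each open bin
    for w in weights:
        for i, space in enumerate(bins):
            if w <= space:           # first bin with room wins
                bins[i] -= w
                break
        else:                        # no existing bin fits: open a new one
            bins.append(c - w)
    return len(bins)

print(first_fit([4, 8, 1, 4, 2, 1], 10))   # 2 bins: {4, 1, 4, 1} and {8, 2}

On the example weights First Fit happens to achieve the optimal 2 bins, packing {4, 1, 4, 1} and {8, 2}, whereas Next Fit needed 3.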
Applications
1. Loading of containers like trucks.
2. Placing data on multiple disks.
3. Job scheduling.
4. Packing advertisements in fixed length radio/TV station breaks.
5. Storing a large collection of music onto tapes/CDs, etc.
Heuristics
The term heuristic is used for algorithms which find solutions among all possible
ones, but do not guarantee that the best will be found; therefore they may be
considered approximate rather than exact algorithms. These algorithms usually
find a solution close to the best one, and they find it fast and easily. Sometimes these
algorithms can be exact, that is, they actually find the best solution, but the
algorithm is still called heuristic until this best solution is proven to be the best. The
method used by a heuristic algorithm is one of the known methods, such as
greediness, but in order to be easy and fast the algorithm ignores or even
suppresses some of the problem's demands.
A heuristic technique, often called simply a heuristic, is any approach to problem solving, learning, or discovery that employs a
practical method not guaranteed to be optimal or perfect, but sufficient for the immediate goals. Where finding an optimal solution
is impossible or impractical, heuristic methods can be used to speed up the process of finding a satisfactory solution. Heuristics
can be mental shortcuts that ease the cognitive load of making a decision. Examples of this method include using a rule of thumb,
an educated guess, an intuitive judgement, guesstimate, stereotyping, profiling, or common sense.
“In computer science, a heuristic is a technique designed for solving a problem more quickly when classic methods are too slow, or for
finding an approximate solution when classic methods fail to find any exact solution. This is achieved by trading optimality, completeness,
accuracy, or precision for speed. In a way, it can be considered a shortcut.”
The objective of a heuristic algorithm is to apply a rule-of-thumb approach to produce a solution in a reasonable time frame that is good
enough for solving the problem at hand. There is no guarantee that the solution found will be the most accurate or optimal solution for
the given problem. We often refer to the solution as “good enough” in most cases.
Heuristic Algorithms
Problem Statement
A traveler needs to visit all the cities from a list, where distances between all the cities
are known and each city should be visited just once. What is the shortest possible
route that he visits each city exactly once and returns to the origin city?
Solution
The travelling salesman problem is one of the most notorious computational problems. We can use
a brute-force approach to evaluate every possible tour and select the best one.
For n vertices in a graph, there are (n − 1)! possibilities.
Instead of brute force, using the dynamic programming approach the solution can be
obtained in less time, though there is no polynomial-time algorithm.
Let us consider a graph G = (V, E), where V is a set of cities and E is a set of weighted
edges. An edge e(u, v) represents that vertices u and v are connected. The distance
between vertices u and v is d(u, v), which should be non-negative.
Suppose we have started at city 1 and after visiting some cities now we are in city j.
Hence, this is a partial tour. We certainly need to know j, since this will determine
which cities are most convenient to visit next. We also need to know all the cities
visited so far, so that we don't repeat any of them. Hence, this is an appropriate sub-
problem.
For a subset of cities S ⊆ {1, 2, 3, ..., n} that includes 1, and j ∈ S, let C(S, j) be the
length of the shortest path visiting each node in S exactly once, starting at 1 and
ending at j.
When |S| > 1, we define C(S, 1) = ∞, since the path cannot both start and end at 1.
Now, let us express C(S, j) in terms of smaller sub-problems. We need to start at 1 and
end at j. We should select the next city in such a way that
C(S, j) = min { C(S − {j}, i) + d(i, j) : i ∈ S and i ≠ j }
Algorithm: Traveling-Salesman-Problem
C({1}, 1) = 0
for s = 2 to n do
   for all subsets S ⊆ {1, 2, 3, ..., n} of size s and containing 1
      C(S, 1) = ∞
      for all j ∈ S and j ≠ 1
         C(S, j) = min { C(S − {j}, i) + d(i, j) : i ∈ S and i ≠ j }
return min over j of C({1, 2, 3, ..., n}, j) + d(j, 1)
Analysis
There are at most 2ⁿ · n sub-problems, and each one takes linear time to solve.
The total running time is therefore O(2ⁿ · n²).
Example
In the following example, we will illustrate the steps to solve the travelling salesman
problem.
The distance matrix d(i, j) is:

     1    2    3    4
1    0   10   15   20
2    5    0    9   10
3    6   13    0   12
4    8    8    9    0
S = Φ
Cost(2, Φ, 1) = d(2, 1) = 5
Cost(3, Φ, 1) = d(3, 1) = 6
Cost(4, Φ, 1) = d(4, 1) = 8
|S| = 1
Cost(i, S, 1) = min { d[i, j] + Cost(j, S − {j}, 1) } for j ∈ S
Cost(2, {3}, 1) = d[2, 3] + Cost(3, Φ, 1) = 9 + 6 = 15
Cost(2, {4}, 1) = d[2, 4] + Cost(4, Φ, 1) = 10 + 8 = 18
Cost(3, {2}, 1) = d[3, 2] + Cost(2, Φ, 1) = 13 + 5 = 18
Cost(3, {4}, 1) = d[3, 4] + Cost(4, Φ, 1) = 12 + 8 = 20
Cost(4, {3}, 1) = d[4, 3] + Cost(3, Φ, 1) = 9 + 6 = 15
Cost(4, {2}, 1) = d[4, 2] + Cost(2, Φ, 1) = 8 + 5 = 13
|S| = 2
Cost(2, {3, 4}, 1) = min { d[2, 3] + Cost(3, {4}, 1) = 9 + 20 = 29, d[2, 4] + Cost(4, {3}, 1) = 10 + 15 = 25 } = 25
Cost(3, {2, 4}, 1) = min { d[3, 2] + Cost(2, {4}, 1) = 13 + 18 = 31, d[3, 4] + Cost(4, {2}, 1) = 12 + 13 = 25 } = 25
Cost(4, {2, 3}, 1) = min { d[4, 2] + Cost(2, {3}, 1) = 8 + 15 = 23, d[4, 3] + Cost(3, {2}, 1) = 9 + 18 = 27 } = 23
|S| = 3
Cost(1, {2, 3, 4}, 1) = min { d[1, 2] + Cost(2, {3, 4}, 1) = 10 + 25 = 35, d[1, 3] + Cost(3, {2, 4}, 1) = 15 + 25 = 40, d[1, 4] + Cost(4, {2, 3}, 1) = 20 + 23 = 43 } = 35
The minimum cost of the complete tour is therefore 35, corresponding to the tour 1 → 2 → 4 → 3 → 1.
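To tie the recurrence and the hand computation together, here is a compact Python sketch of the dynamic-programming algorithm (the Held-Karp approach). It numbers the cities 0-3 instead of 1-4 and fixes city 0 as the start, so the indexing convention differs from the table above, but the optimal tour cost it prints matches the 35 computed by hand.

from itertools import combinations

d = [[0, 10, 15, 20],      # the distance matrix from the example above,
     [5,  0,  9, 10],      # re-indexed so that city 1 becomes city 0, etc.
     [6, 13,  0, 12],
     [8,  8,  9,  0]]

def tsp(d):
    n = len(d)
    # C[(S, j)] = cost of the shortest path that starts at city 0, visits
    # every city in frozenset S exactly once, and ends at j (0 not in S).
    C = {(frozenset([j]), j): d[0][j] for j in range(1, n)}
    for size in range(2, n):
        for S in map(frozenset, combinations(range(1, n), size)):
            for j in S:
                # the recurrence: extend the best smaller path ending at i
                C[(S, j)] = min(C[(S - {j}, i)] + d[i][j] for i in S - {j})
    full = frozenset(range(1, n))
    # close the tour by returning from the last city j to the start
    return min(C[(full, j)] + d[j][0] for j in full)

print(tsp(d))   # 35, matching the hand computation above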