
Algo - Chapter 3: Greedy Algorithms

By: Elsay M.
Outline for Greedy Algorithms (6hr)

3.1. General Characteristics of Greedy Algorithms
3.2. Graph Minimum Spanning Tree (MST)
- Kruskal’s and Prim’s Algorithms
3.3. Shortest Paths
3.4. Scheduling
Overview:
• This chapter explores the fundamentals of greedy
algorithms, focusing on their general characteristics and
applications in various computational problems.
• Greedy algorithms are iterative processes that make
locally optimal choices at each step with the aim of
finding a global optimum.
• Despite their simplicity, greedy algorithms are powerful
tools for solving optimization problems, especially in
graph theory and scheduling.
3.1 General Characteristics of
Greedy Algorithms
• Definition:
• A greedy algorithm follows a problem-solving heuristic
where the best, or “greediest,” option is selected at
each stage in the hope of finding an optimal overall
solution.
• The key feature of these algorithms is that they make a
series of choices that seem best in the short term,
without considering the broader consequences.
Key Characteristics:
Local Optimal Choice: At each stage of the algorithm, the best
possible decision is made without reconsidering previous decisions.
Optimal Substructure: A problem exhibits an optimal substructure
if an optimal solution to the problem contains optimal solutions to its
subproblems. Greedy algorithms thrive in problems with this
characteristic.
Greedy Choice Property: Greedy algorithms make decisions based
on the hope that choosing the local optimum leads to the global
optimum. This is crucial in problems like Minimum Spanning Trees
(MST) and scheduling.
Non-Backtracking: Greedy algorithms do not revisit past decisions
or correct them later. This property makes them efficient but less
flexible than other approaches like dynamic programming.
3.1. General Characteristics of Greedy Algorithms
Q1. True or False: A greedy algorithm always finds the globally optimal solution by
making a series of locally optimal choices.
• Answer: False. Explanation: Greedy algorithms do not always guarantee a globally
optimal solution, though for certain problems (such as Minimum Spanning Tree or
Huffman Coding) they do produce optimal solutions.
Q2. Fill in the blank: A key characteristic of a greedy algorithm is that once a choice is
made, it is never _______.
• Answer: Reconsidered. Explanation: Greedy algorithms make choices based on current
information and never backtrack to reconsider past decisions.
Q3. Multiple Choice: Which of the following is NOT a characteristic of a greedy algorithm?
a) Local optimality b) Greedy choice property c) Overlapping subproblems d) Optimal
substructure
• Answer: c) Overlapping subproblems. Explanation: Overlapping subproblems is a
characteristic of dynamic programming, not greedy algorithms.
Q4. Short Answer: Explain why a greedy algorithm might fail for certain problems. Can
you give an example?
• Answer: Greedy algorithms may fail because they only make locally optimal choices
without considering the bigger picture. An example is the 0/1 knapsack problem, where a
greedy algorithm selecting items by the highest value-to-weight ratio might not
produce the optimal solution; dynamic programming is required for the optimal result.
A sketch of this failure follows below.
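To make this concrete, here is a minimal Python sketch of the ratio rule failing on a classic illustrative 0/1 knapsack instance (the item values and weights are assumed for demonstration):

```python
# Minimal sketch: greedy by value-to-weight ratio on a 0/1 knapsack.
# The instance below is an assumed, classic textbook example.
items = [(60, 10), (100, 20), (120, 30)]  # (value, weight) pairs
capacity = 50

greedy_value, remaining = 0, capacity
# Take whole items in decreasing ratio order while they still fit.
for value, weight in sorted(items, key=lambda it: it[0] / it[1], reverse=True):
    if weight <= remaining:
        greedy_value += value
        remaining -= weight

print(greedy_value)  # 160 (takes the items of weight 10 and 20)
# The true optimum is 220 (weights 20 and 30), so the greedy choice fails.
```

Note that for the fractional knapsack, where items may be split, the same ratio rule is optimal; the failure is specific to the all-or-nothing (0/1) variant.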
3.2 Graph Minimum Spanning Tree (MST)
• A Minimum Spanning Tree (MST) is a subset of the
edges in a graph that connects all the vertices together
without any cycles and with the minimum possible total
edge weight.
• Greedy algorithms are particularly effective for finding
MSTs in undirected, weighted graphs.
Kruskal’s Algorithm:
• Kruskal's Algorithm is a popular algorithm used to
find the Minimum Spanning Tree (MST) of a graph.
The goal of the MST is to connect all the vertices (or
nodes) in the graph with the minimum possible total
edge weight, without creating any cycles (loops).
• Approach: Kruskal's algorithm is a greedy approach
that builds the MST by considering the edges in
increasing order of their weight.
How Kruskal’s Algorithm Works:
1. Sort all edges: First, list all the edges of the graph and sort them in
increasing order based on their weights (the cost of traveling between two
nodes).
2. Pick the smallest edge: Start with the smallest edge and add it to your
MST, as long as adding that edge doesn’t form a cycle.
3. Repeat: Keep adding the next smallest edge that doesn’t form a cycle, until
all vertices are connected.
4. Result: The result is a spanning tree that connects all nodes with the least
total weight.
Efficiency: Kruskal’s algorithm is efficient for sparse graphs and often uses a
union-find data structure to handle cycle detection efficiently.

Kruskal's algorithm works by picking the smallest available edge and making
sure no cycles are formed, ensuring that all nodes are connected with the least
cost.
Example:
• Let’s say we have 4 nodes: A, B, C, and D, and these are connected by edges
with the following weights:
• A—B: 4, A—C: 1, B—C: 2, B—D: 5, C—D: 3
• Steps for Kruskal's Algorithm:
• List edges: First, list all edges and their weights:
• A—C: 1, B—C: 2, C—D: 3, A—B: 4, B—D: 5
• Sort edges by weight:
• A—C: 1, B—C: 2, C—D: 3, A—B: 4, B—D: 5
• Pick the smallest edge: Start by adding the smallest edge, A—C with
weight 1.
• Next smallest edge: Add B—C with weight 2. No cycle is formed.
• Next smallest edge: Add C—D with weight 3. No cycle is formed.
• Stop: Now all nodes (A, B, C, and D) are connected. Adding A—B or B—D
would form a cycle, so we stop here.
• Result: The Minimum Spanning Tree consists of the edges A—C, B—C, and
C—D, with a total weight of 1 + 2 + 3 = 6.
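The following is a minimal Python sketch of Kruskal's algorithm run on this example; the union-find helper and the vertex numbering (A, B, C, D mapped to 0-3) are illustrative choices, not part of the slides:

```python
def kruskal(num_vertices, edges):
    """edges: list of (weight, u, v) with vertices labeled 0..num_vertices-1."""
    parent = list(range(num_vertices))     # union-find forest

    def find(x):                           # root of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst, total = [], 0
    for weight, u, v in sorted(edges):     # edges in increasing weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:               # different components: no cycle
            parent[root_u] = root_v        # union the two components
            mst.append((u, v, weight))
            total += weight
    return mst, total

# A, B, C, D mapped to 0, 1, 2, 3
edges = [(4, 0, 1), (1, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
print(kruskal(4, edges))  # ([(0, 2, 1), (1, 2, 2), (2, 3, 3)], 6)
```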
Prim’s Algorithm:
• Prim’s Algorithm is another popular algorithm to find the
Minimum Spanning Tree (MST) of a graph, which connects all
vertices (or nodes) with the least total edge weight without
forming any cycles.
• The main difference between Prim’s Algorithm and Kruskal’s
Algorithm is that Prim’s starts from a single node and grows
the MST by adding the nearest vertex to the existing tree, while
Kruskal’s algorithm adds edges in order of weight.
• Approach: Prim's algorithm builds the MST by expanding from a
single vertex, continuously adding the nearest vertex to the
growing MST.
How Prim's Algorithm Works:
1. Start with any node: Choose any node as the starting point
for your MST.
2. Add the nearest node: Find the edge with the smallest
weight that connects a node in the current MST to a node
outside it. Add this edge and the new node to the MST.
3. Repeat: Keep adding the nearest node (connected by the
smallest edge) until all vertices are part of the MST.
4. Result: The result is a minimum spanning tree that connects
all nodes with the least total weight.
Efficiency: Prim’s algorithm is efficient for dense graphs and
can be implemented with priority queues to speed up the
selection of the minimum-weight edge.
Example:
• Let’s use the same graph from the Kruskal example, with 4 nodes: A, B, C, and D,
and these edges:
• A—B: 4, A—C: 1, B—C: 2, B—D: 5, C—D: 3
Steps for Prim's Algorithm:
• Start with a node: Let’s start with node A.
• Find the smallest edge from A:
• A—C has weight 1, so add edge A—C to the MST.
• Find the smallest edge connected to A or C:
• The edges now connected to A or C are B—C (weight 2) and C—D (weight 3).
• The smallest edge is B—C, with weight 2, so add edge B—C to the MST.
• Find the smallest edge connected to A, B, or C:
• The remaining edges connected are C—D (weight 3) and B—D (weight 5).
• The smallest edge is C—D, with weight 3, so add edge C—D to the MST.
• Stop: Now all nodes (A, B, C, and D) are connected.
• Result:
• The Minimum Spanning Tree includes the edges A—C, B—C, and C—D, with a
total weight of 1 + 2 + 3 = 6.
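Below is a minimal Python sketch of Prim's algorithm on the same graph, using the standard-library heapq module as the priority queue (the adjacency-list encoding is an illustrative choice):

```python
import heapq

def prim(adj, start=0):
    """adj: dict mapping vertex -> list of (weight, neighbor)."""
    visited = {start}
    heap = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(heap)                    # frontier edges, lightest first
    mst, total = [], 0
    while heap and len(visited) < len(adj):
        w, u, v = heapq.heappop(heap)      # lightest edge leaving the tree
        if v in visited:
            continue                       # both ends in tree: would cycle
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for w2, nxt in adj[v]:             # expose v's edges to the frontier
            if nxt not in visited:
                heapq.heappush(heap, (w2, v, nxt))
    return mst, total

# A, B, C, D mapped to 0, 1, 2, 3
adj = {0: [(4, 1), (1, 2)],
       1: [(4, 0), (2, 2), (5, 3)],
       2: [(1, 0), (2, 1), (3, 3)],
       3: [(5, 1), (3, 2)]}
print(prim(adj))  # ([(0, 2, 1), (2, 1, 2), (2, 3, 3)], 6)
```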
Summary:
• Prim’s algorithm starts with a single node and repeatedly
adds the smallest edge that connects a new node to the tree.
• It gradually builds the MST by adding the nearest node
to the growing tree, ensuring the least total cost.
• This method is especially useful when the graph is
dense (has many edges), as Prim's tends to be faster in
such cases.
3.2. Graph Minimum Spanning Tree (MST)
Kruskal’s and Prim’s Algorithms
Q1. Multiple Choice: Kruskal's algorithm primarily relies on:
a) Depth-first search b) Sorting edges by weight
c) A queue to select the next node d) Dynamic programming
• Answer: b) Sorting edges by weight. Explanation: Kruskal’s algorithm sorts all edges
by weight and adds them to the Minimum Spanning Tree if they don’t form a cycle.
Q2. True or False: In Prim’s algorithm, we always start with the smallest edge and
grow the spanning tree one edge at a time.
• Answer: False. Explanation: Prim's algorithm starts from a single vertex and adds
the smallest edge connecting the tree to a vertex not yet in the tree, but the starting
edge doesn’t have to be the smallest overall.
Q3. Short Answer: Compare and contrast Kruskal’s and Prim’s algorithms. In what
types of graphs is one preferred over the other?
• Answer: Kruskal’s algorithm is edge-centric and better for sparse graphs, as it
selects the smallest edges and checks for cycles. Prim’s algorithm is vertex-centric,
starting from any vertex and growing the MST by adding the smallest edge. Prim’s is
preferred for dense graphs since it uses a priority queue to find the next closest
vertex efficiently.
3.3 Shortest Paths
• In this section, we consider greedy algorithms for solving
shortest path problems in graphs.
• When we talk about Shortest Paths, we refer to finding
the least-cost or least-distance path between two points
(or nodes) in a network. This is especially useful in fields
like networking (routing data through servers) or mapping
(finding the quickest route between locations).
• Unlike the MST problem, the shortest path problem finds
the minimum path between two specific vertices, or from
one vertex to all others in a weighted graph.
Dijkstra’s Algorithm:
• Dijkstra’s algorithm is a greedy algorithm used to find the
shortest paths from a single source vertex to all other vertices
in a graph with non-negative edge weights.
• Steps:
• Start from the source vertex and assign it a tentative
distance of 0, with all other vertices initially set to infinity.
• Explore the nearest vertex (with the smallest tentative
distance), and update the distances of its neighboring
vertices.
• Repeat this process for the next nearest vertex until all
vertices have been processed.
• Efficiency: Dijkstra’s algorithm is efficient when combined
with a priority queue to find the next closest vertex.
Example:
• Imagine you have a graph of cities connected by roads, with
each road labeled with its distance. If you start from City A
and want to find the shortest path to all other cities,
Dijkstra's Algorithm would start by visiting neighboring cities
first and updating their shortest distance from City A,
repeating this process for each city until you've found the
shortest route to every other city.
• Start from A.
• Nearest neighbors are B (2) and D (1).
• Update shortest distances for B and D.
• Move to D (since it has the smaller distance of 1),
check neighbors, and continue updating.
• Repeat until all cities are visited.
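The slide's city graph comes from a figure that is not reproduced here, so the sketch below uses an assumed graph consistent with the distances mentioned (B at 2 and D at 1 from A); the structure is the standard heapq-based Dijkstra:

```python
import heapq

def dijkstra(adj, source):
    """adj: dict vertex -> list of (weight, neighbor); weights must be >= 0."""
    dist = {v: float("inf") for v in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)         # closest unsettled vertex
        if d > dist[u]:
            continue                       # stale entry; a shorter path won
        for w, v in adj[u]:                # relax every edge out of u
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

adj = {"A": [(2, "B"), (1, "D")],
       "B": [(2, "A"), (3, "C")],
       "C": [(3, "B"), (1, "D")],
       "D": [(1, "A"), (1, "C")]}
print(dijkstra(adj, "A"))  # {'A': 0, 'B': 2, 'C': 2, 'D': 1}
```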
Bellman-Ford Algorithm (Greedy Variant):
• The Bellman-Ford Algorithm also finds the shortest
paths from a source node to all other nodes in a graph,
but unlike Dijkstra’s Algorithm, it can handle graphs
where edge weights can be negative (meaning you can
"gain" distance or cost along certain paths).
• While Bellman-Ford is typically considered a dynamic
programming algorithm, a greedy variant can be used
when negative weights are absent.
• The algorithm computes the shortest paths by iterating
through all edges and updating distances in a greedy
manner, but its time complexity makes it less efficient
than Dijkstra’s algorithm.
How it works:
• Initialize distances from the source node to infinity,
except the source itself, which is 0.
• For each edge in the graph, update the shortest known
distance to each neighboring node. This is done V-1
times (where V is the number of nodes).
• After V-1 passes, check once more to detect any
negative-weight cycles (where the total distance
decreases indefinitely due to looping).
Example:
• Imagine you have a network of computers, and some
connections between them may have negative weights due
to data optimizations (less cost for transferring data in
certain directions). Bellman-Ford helps you find the shortest
paths while considering these negative weights.
• Start from A.
• Bellman-Ford updates the distances for
every edge multiple times to ensure all
shortest paths are considered.
• It can detect negative cycles
(like looping between B and E endlessly,
decreasing the cost).
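Here is a minimal Python sketch of Bellman-Ford over an edge list; the small graph (with one negative edge) is assumed for illustration, since the slide's network figure is not reproduced:

```python
def bellman_ford(vertices, edges, source):
    """edges: list of (u, v, w); weights may be negative."""
    dist = {v: float("inf") for v in vertices}
    dist[source] = 0
    for _ in range(len(vertices) - 1):     # V-1 relaxation passes
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:                  # one extra pass: still improving?
        if dist[u] + w < dist[v]:
            raise ValueError("negative-weight cycle detected")
    return dist

vertices = ["A", "B", "C", "D"]
edges = [("A", "B", 4), ("A", "C", 2), ("B", "D", 3),
         ("C", "B", -1), ("C", "D", 5)]
print(bellman_ford(vertices, edges, "A"))
# {'A': 0, 'B': 1, 'C': 2, 'D': 4}  -- the negative edge C->B shortens B
```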
Key Differences:
• Dijkstra's Algorithm is faster but works only for non-
negative weights.
• Bellman-Ford can handle graphs with negative
weights but takes longer since it updates paths multiple
times.
• Both algorithms are essential tools in network
routing, navigation systems, and resource
optimization scenarios.
3.4 Scheduling
• Scheduling is another domain where greedy algorithms
excel. In scheduling problems, tasks need to be assigned
to resources (such as machines or time slots) in an
optimal way.
• Greedy algorithms can provide effective solutions for
specific types of scheduling problems, where tasks must be
selected or prioritized based on certain criteria.
• Let's look at three examples: Job Scheduling with
Deadlines, Interval Scheduling, and Huffman Coding
for data compression.
Job Scheduling with Deadlines:
• This problem involves scheduling jobs such that
deadlines are met and profit is maximized.
o Greedy Approach: Sort the jobs by profit in descending
order and assign each job to the latest available time slot
on or before its deadline.
How it works:
• Sort jobs in descending order by profit.
• Starting from the highest profit, assign each job to the
latest available time slot before its deadline (if available).
• If no such time slot is available, skip the job.
• Example of Job Scheduling with Deadlines:

Job | Deadline | Profit
A   | 2        | 100
B   | 1        | 19
C   | 2        | 27
D   | 1        | 25

Q. Imagine you have 4 jobs with the above properties:
• Sort by profit: A, C, D, B
• Start with job A (profit 100): schedule it at the latest slot before its deadline, which is slot 2.
• Next, schedule job C (profit 27) at slot 1 (the latest free slot before its deadline).
• Jobs D (profit 25) and B (profit 19) both have deadlines of 1, but since slot 1 is already filled, we skip both.
• FINAL SCHEDULE (total profit 127):

Slot | Job | Profit
1    | C   | 27
2    | A   | 100
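A minimal Python sketch of this profit-first, latest-slot rule, run on the four jobs above (function and variable names are illustrative):

```python
def schedule_jobs(jobs):
    """jobs: list of (name, deadline, profit); slots run 1..max deadline."""
    max_deadline = max(d for _, d, _ in jobs)
    slots = [None] * (max_deadline + 1)    # slots[1..max_deadline]; 0 unused
    total = 0
    for name, deadline, profit in sorted(jobs, key=lambda j: -j[2]):
        for t in range(deadline, 0, -1):   # try the latest free slot first
            if slots[t] is None:
                slots[t] = name
                total += profit
                break                      # placed; jobs with no slot are skipped
    return slots[1:], total

jobs = [("A", 2, 100), ("B", 1, 19), ("C", 2, 27), ("D", 1, 25)]
print(schedule_jobs(jobs))  # (['C', 'A'], 127)
```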
Interval Scheduling:
• Interval Scheduling is about selecting the maximum number of
non-overlapping intervals (tasks or jobs) from a set of intervals.
Each task has a start and end time, and the goal is to choose as
many tasks as possible without any two overlapping.
o Greedy Approach: Choose the task that finishes the earliest (i.e., has
the smallest finish time), and keep selecting tasks that start after the
previously selected task finishes. This approach ensures that the
maximum number of non-overlapping tasks are scheduled.
How it works:
• Sort the intervals by their end times.
• Pick the interval with the earliest finish time and add it to your
schedule.
• Move to the next interval that starts after the current one
finishes, repeating the process.
Example: Suppose you have five tasks:

Task | Start | End
A    | 1     | 4
B    | 3     | 5
C    | 0     | 6
D    | 5     | 7
E    | 8     | 9

• Sort by end time: A, B, C, D, E
• Choose task A (ends at time 4). B and C overlap with A, so skip them.
• Next, select D (starts after A ends, finishes at time 7).
• Lastly, select E (starts after D ends, finishes at time 9).

Final schedule:

Task | Start | End
A    | 1     | 4
D    | 5     | 7
E    | 8     | 9

This way, you maximize the number of non-overlapping intervals. In this
case, we can fit 3 tasks.
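The earliest-finish-time rule is only a few lines of Python; the sketch below reproduces the A, D, E selection:

```python
def interval_schedule(tasks):
    """tasks: list of (name, start, end); returns a maximum compatible subset."""
    chosen, last_end = [], float("-inf")
    for name, start, end in sorted(tasks, key=lambda t: t[2]):  # by end time
        if start >= last_end:              # compatible with the last pick
            chosen.append(name)
            last_end = end                 # new frontier: this task's finish
    return chosen

tasks = [("A", 1, 4), ("B", 3, 5), ("C", 0, 6), ("D", 5, 7), ("E", 8, 9)]
print(interval_schedule(tasks))  # ['A', 'D', 'E']
```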
Huffman Coding (Data
Compression):
• Huffman coding uses a greedy algorithm to assign variable-length
codes to input characters, with shorter codes assigned to more
frequent characters. This is used in data compression schemes.
• The goal is to minimize the total number of bits required to
represent the data.
How it works:
• Count the frequency of each character in the input.
• Create a priority queue where each character is a node, prioritized by
its frequency.
• Repeatedly combine the two lowest-frequency nodes into a new node
whose frequency is the sum of the two. This process builds a binary
tree.
• Assign binary codes (0 and 1) to each character based on the path
from the root of the tree to that character's leaf.
Example: Suppose you have the following character frequencies:

Character | Frequency
A         | 5
B         | 9
C         | 12
D         | 13
E         | 16
F         | 45
• Start by combining A (5) and B (9), forming a new node with
frequency 14.
• Then, combine C (12) and D (13), forming a new node with
frequency 25.
• Next, combine the two smallest nodes: A+B (14) and E (16),
forming a new node with frequency 30.
• Then combine C+D (25) with A+B+E (30) to form 55, and
finally combine F (45) with that node to form the root.
Step-by-Step Explanation of
Huffman Coding:
1. List the Characters and Frequencies
• We start with the following characters and their
frequencies:
o A: 5, B: 9, C: 12, D: 13, E: 16, F: 45
2. Build a Min-Heap or Priority Queue
o We insert each character and its frequency into a priority
queue (or min-heap), which will always keep the lowest
frequency nodes at the front.
o So initially, we have: A(5), B(9), C(12), D(13), E(16), F(45)
Cont.
3. Build the Huffman Tree
• To build the Huffman tree, we combine the two smallest
frequency nodes step by step until only one node
remains (the root of the tree). Here’s how we do it:
• Step 1: Take the two smallest frequencies, A(5) and B(9).
Combine them into a new node with a frequency of 5 + 9 =
14. This new node becomes the parent of A and B.
 New Tree: (A+B) = 14
 Remaining: C(12), D(13), E(16), F(45), (A+B)(14)
• Step 2: Now, take the two smallest remaining frequencies:
C(12) and D(13). Combine them into a new node with a
frequency of 12 + 13 = 25. This new node becomes the parent
of C and D.
 New Tree: (C+D) = 25
 Remaining: E(16), F(45), (A+B)(14), (C+D)(25)
Build the Huffman Tree
o Step 3: Next, take the two smallest nodes: (A+B)(14) and
E(16). Combine them into a new node with a frequency of 14
+ 16 = 30. This new node becomes the parent of (A+B) and E.
 New Tree: ((A+B)+E) = 30
 Remaining: F(45), (C+D)(25), ((A+B)+E)(30)
o Step 4: Now, take the two smallest nodes: (C+D)(25) and
((A+B)+E)(30). Combine them into a new node with a
frequency of 25 + 30 = 55. This new node becomes the
parent of (C+D) and ((A+B)+E).
 New Tree: ((C+D)+((A+B)+E)) = 55
 Remaining: F(45), ((C+D)+((A+B)+E))(55)
o Step 5: Finally, combine the last two nodes: F(45) and ((C+D)
+((A+B)+E))(55). The frequency of the new root node is 45 +
55 = 100, which becomes the root of the entire Huffman tree.
o Final Tree: ((F) + ((C+D)+((A+B)+E))) = 100
Cont.

                [100]
               /     \
         F(0) [45]    [55]
                     /    \
                 [30]      [25]
                 /  \      /   \
             [14]  E(101) C(110) D(111)
             /  \
       A(1000)  B(1001)
Huffman Tree Example:
• Start by combining A (5) and B (9), forming a new node with frequency 14.
• Then, combine C (12) and D (13), forming a new node with frequency 25.
• Next, combine the two smallest nodes: A+B (14) and E (16),
forming a new node with frequency 30.
• Then combine C+D (25) with A+B+E (30) to form 55, and finally
combine F (45) with that node to form the root (100).
• Character F gets the shortest code
since it has the highest frequency.
• After assigning binary codes from the tree,
you get variable-length codes that compress the data efficiently.
• This method is used in file compression formats like ZIP and JPEG.
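A minimal heapq-based Python sketch of Huffman coding, run on the frequencies above. Ties in the heap may be broken differently from the hand-built tree, so individual codes can differ, but the code lengths and total size match:

```python
import heapq
from itertools import count

def huffman_codes(freqs):
    """freqs: dict char -> frequency. Returns dict char -> binary code string."""
    tiebreak = count()                     # keeps heap tuples comparable
    heap = [(f, next(tiebreak), {ch: ""}) for ch, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)  # two lowest-frequency subtrees
        f2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

freqs = {"A": 5, "B": 9, "C": 12, "D": 13, "E": 16, "F": 45}
print(huffman_codes(freqs))
# {'F': '0', 'C': '100', 'D': '101', 'A': '1100', 'B': '1101', 'E': '111'}
```

Encoding a string is then a simple lookup: `''.join(codes[ch] for ch in text)`.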
More Examples
• Example 1: Compressing a Short String
• Suppose you want to compress the string ABBCCCDDDD.
First, calculate the frequency of each character:
• A: 1 occurrence, B: 2 occurrences, C: 3 occurrences, D: 4
occurrences
• Step 1: Build a Min-Heap (Priority Queue)
• Insert each character with its frequency into a min-heap:
• A (1), B (2), C (3), D (4)
• Step 2: Build the Huffman Tree
• Convention: traversing left from a node appends 0; traversing
right appends 1.
Cont..
• Combine the two smallest elements until one tree remains:
 Combine A (1) and B (2): cost = 1 + 2 = 3 (new node)
 Combine C (3) and (A+B) (3): cost = 3 + 3 = 6 (new node)
 Combine D (4) and (C+(A+B)) (6): cost = 4 + 6 = 10 (root node)
Cont.
• D has a code of 0 because it is the left child of the root.
• C has a code of 10 (right, then left).
• A has a code of 110 (right-right-left).
• B has a code of 111 (right-right-right).
• Step 3: Assign Huffman Codes
• From the tree, assign binary codes: A: 110, B: 111, C: 10, D: 0
• Step 4: Encode the String
• Original string: ABBCCCDDDD
• A → 110, B → 111, C → 10, D → 0
• Encoded string: 110 111 111 10 10 10 0 0 0 0 (19 bits)
• Fixed-length encoding would need 2 bits per character (20 bits here),
so the saving on this short string is small; Huffman's advantage grows
as the character frequencies become more skewed.
Example 2: Compressing a More Complex String
• Consider the string HELLOHUFFMAN (12 characters). First, calculate the
frequency of each character:
• H: 2, E: 1, L: 2, O: 1, U: 1, F: 2, M: 1, A: 1, N: 1
Step 1: Build a Min-Heap
• Insert each character with its frequency into the heap:
• E (1), O (1), U (1), M (1), A (1), N (1), H (2), L (2), F (2)
Step 2: Build the Huffman Tree
• Combine the smallest elements (ties may be broken in any order; one
valid sequence is shown):
• Combine E (1) and O (1): new node = 2
• Combine U (1) and M (1): new node = 2
• Combine A (1) and N (1): new node = 2
• Combine H (2) and L (2): new node = 4
• Combine F (2) and (E+O) (2): new node = 4
• Combine (U+M) (2) and (A+N) (2): new node = 4
• Combine (H+L) (4) and (F+(E+O)) (4): new node = 8
• Combine ((U+M)+(A+N)) (4) and that node (8): root node = 12
• Step 3: Assign Huffman Codes
• From this tree, one valid assignment is:
• U: 000, M: 001, A: 010, N: 011, H: 100, L: 101, F: 110, E: 1110, O: 1111
• Step 4: Encode the String
• Original string: HELLOHUFFMAN
• H → 100, E → 1110, L → 101, L → 101, O → 1111, H → 100, U → 000,
F → 110, F → 110, M → 001, A → 010, N → 011
• Encoded string: 100 1110 101 101 1111 100 000 110 110 001 010 011
(38 bits vs. 48 bits at 4 bits per character)
• These examples show how Huffman Coding compresses data by
assigning shorter codes to more frequent characters.
Summary of Key Concepts in Scheduling :
• Job Scheduling with Deadlines: Schedule jobs to maximize
profit while respecting deadlines.
• Interval Scheduling: Select the maximum number of non-
overlapping intervals (tasks) based on their end times.
• Huffman Coding: Compress data by assigning shorter
binary codes to more frequent characters, minimizing the
total number of bits.
• All these examples use the greedy approach: make the best
immediate decision and hope that it leads to the optimal
solution!
Summary:
• Greedy algorithms are powerful for solving optimization
problems efficiently. However, they are not always guaranteed
to find the global optimum in every problem.
• The success of a greedy algorithm depends on whether the
problem has properties like optimal substructure and the
greedy choice property.
• Algorithms like Kruskal’s, Prim’s, and Dijkstra’s demonstrate
the strength of the greedy approach in graph theory, while
greedy scheduling methods highlight its practical applications
in real-world problems.
Summary of Key Unsolved Problems
Dynamic Graph MST – How to efficiently maintain the MST in graphs
that change over time.
Handling Uncertainty in MST – Adapting MST algorithms to work with
uncertain or fluctuating edge weights.
Negative Weights in Shortest Path – Developing efficient algorithms
for graphs with negative weight cycles.
Multi-Objective Shortest Path – Optimizing shortest paths when
multiple criteria must be considered.
Job Scheduling with Dependencies – Handling complex task
dependencies and resource constraints in scheduling.
Online Scheduling – Making efficient scheduling decisions in real-time
with incomplete knowledge of future tasks.
Fairness in Scheduling – Balancing efficiency and fairness when
scheduling tasks in competitive environments.
