
Huffman Coding
The Huffman encoding algorithm starts by constructing a list of all the alphabet symbols in
descending order of their probabilities. It then constructs a binary tree, from the bottom up,
with a symbol at every leaf. This is done in steps, where at each step the two symbols with the
smallest probabilities are selected, joined under a new node at the top of the partial tree,
deleted from the list, and replaced with an auxiliary symbol representing both of them. When the
list is reduced to just one auxiliary symbol (representing the entire alphabet), the tree is
complete. The tree is then traversed from the root to the leaves to determine the code words of
the symbols; symbols that occur more often end up closer to the root and therefore receive
shorter codes.
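
For example, suppose the alphabet is {a, b, c, d, e} with assumed probabilities 0.4, 0.2, 0.2, 0.1, 0.1. The two rarest symbols d and e are merged first into an auxiliary symbol of probability 0.2; c can then be merged with it (0.4), a with b (0.6), and finally the two remaining subtrees are joined at the root. One valid resulting code (ties may be broken differently) is a = 00, b = 01, c = 10, d = 110, e = 111, whose average length is 0.4(2) + 0.2(2) + 0.2(2) + 0.1(3) + 0.1(3) = 2.2 bits per symbol, compared with 3 bits per symbol for a fixed-length code over five symbols.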

Huffman’s Algorithm:
1. Create a terminal node for each ai with probability p(ai) and let S = the set of terminal
nodes.
2. Select nodes x and y in S with the two smallest probabilities.
3. Replace x and y in S by a node with probability p(x) + p(y). Also, create a node in the tree
which is the parent of x and y.
4. Repeat (2)-(3) until |S| = 1.

Matlab Code:
%%%%%%%%%%%%%%%%%%%%%%% Huffman Coding %%%%%%%%%%%%%%%%%%%%%%%%%%%%

close all;clear all;clc

%% User defined input

String=input('Enter character to encode\n','s') % Message must consist of characters

%% User defined probabilities

fprintf('\nRemember !\n')
disp('Allocate probabilities such that')
disp('sum of all probabilities must be 1')
for i=1:length(String)
    fprintf('\nProbability of\n')
    disp(String(i))
    Prob(i)=input('is \n')
end
total_prob=sum(Prob)

%% Encoding Bit Calculation

num_bits = ceil(log2(length(Prob)))
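% (e.g. an alphabet of 5 symbols needs ceil(log2(5)) = 3 bits per symbol with a
%  fixed-length code; the Huffman code does at least as well on average)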

%% Corresponding Probabilities

disp('Character Probability:');

for i = 1:length(Prob)
    display(strcat(String(i),' --> ',num2str(Prob(i))));
end

%% Initialize The Encoding Array

for i = 1:length(String)
    Str_in_cell{i} = String(i);
end

% Save initial set of symbols and probabilities for later use
init_str = Str_in_cell
init_prob = Prob

%% Huffman Encoding Process

sorted_prob = Prob;
counter = 1;
while (length(sorted_prob) > 1)
    % Sort probs
    [sorted_prob,indeces] = sort(sorted_prob,'ascend');

    % Sort string based on indeces
    Str_in_cell = Str_in_cell(indeces);

    % Create new symbol
    new_node = strcat(Str_in_cell(2),Str_in_cell(1));
    new_prob = sum(sorted_prob(1:2));

    % Dequeue used symbols from "old" queue
    Str_in_cell = Str_in_cell(3:length(Str_in_cell));
    sorted_prob = sorted_prob(3:length(sorted_prob));

    % Add new symbol back to "old" queue
    Str_in_cell = [Str_in_cell, new_node];
    sorted_prob = [sorted_prob, new_prob];

    % Add new symbol to "new" queue
    newq_str(counter) = new_node;
    newq_prob(counter) = new_prob;
    counter = counter + 1;
end

%% Huffman Tree Data

tree = [newq_str,init_str];
tree_prob = [newq_prob, init_prob];
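% (Note: each merged node's name is the concatenation of its children's names,
%  so every node's name occurs as a substring of each of its ancestors' names;
%  the parent search below relies on this property.)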

% Sort all tree elements
[sorted_tree_prob,indeces] = sort(tree_prob,'descend');
sorted_tree = tree(indeces);

%% Calculate Tree Parameters

parent= 0;

num_children = 2;
for i = 2:length(sorted_tree)
    % Extract my symbol
    me = sorted_tree{i};

    % Find my parent's symbol (search until shortest match is found)
    count = 1;
    parent_maybe = sorted_tree{i-count};
    diff = strfind(parent_maybe,me);
    while (isempty(diff))
        count = count + 1;
        parent_maybe = sorted_tree{i-count};
        diff = strfind(parent_maybe,me);
    end
    parent(i) = i - count;
end

%% Plot the Huffman Tree

treeplot(parent);
title(strcat('Huffman Coding Tree - "',String,'"'));

%% Console Output - Tree Symbols and Their Probabilities

display(sorted_tree)
display(sorted_tree_prob)

%% Tree Parameter Extraction

[xs,ys,h,s] = treelayout(parent);
%% Label Tree Nodes

text(xs,ys,sorted_tree);

%% Label Tree Edges

for i = 2:length(sorted_tree)
    % Get my coordinate
    my_x = xs(i);
    my_y = ys(i);

    % Get parent coordinate
    parent_x = xs(parent(i));
    parent_y = ys(parent(i));

    % Calculate weight coordinate (midpoint)
    mid_x = (my_x + parent_x)/2;
    mid_y = (my_y + parent_y)/2;

    % Calculate weight (positive slope = 0, negative slope = 1)
    slope = (parent_y - my_y)/(parent_x - my_x);
    if (slope > 0)
        weight(i) = 0;
    else
        weight(i) = 1;
    end

    text(mid_x,mid_y,num2str(weight(i)));
end

%% Huffman Codebook Calculation
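% (Each code word is built by walking from a node up towards the root,
%  prepending the 0/1 edge weight assigned above at every step.)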

for i = 1:length(sorted_tree)
    % Initialize code
    code{i} = '';

    % Loop until root is found
    index = i;
    p = parent(index);
    while(p ~= 0)
        % Turn weight into code symbol
        w = num2str(weight(index));

        % Concatenate code symbol
        code{i} = strcat(w,code{i});

        % Continue towards root
        index = parent(index);
        p = parent(index);
    end
end

%% Huffman Codebook

codeBook = [sorted_tree', code']
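
As a quick sanity check, something like the following could be appended to the script. This is a rough sketch (not part of the original code) that assumes the workspace variables produced above (sorted_tree, code, init_prob, String, num_bits) and distinct characters in the input string:

avg_len = 0;
for i = 1:length(sorted_tree)
    if length(sorted_tree{i}) == 1                 % leaf node = one input symbol
        p = init_prob(String == sorted_tree{i});   % probability entered for that symbol
        avg_len = avg_len + p*length(code{i});     % weight its code length by probability
    end
end
fprintf('Average code length: %.2f bits/symbol (fixed-length: %d bits)\n', avg_len, num_bits)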

Huffman Tree:

Figure 1. Huffman tree plot for input "abcde" with user defined probabilities
(plot title: "Huffman Coding Tree - abcde"; tree height = 4).
