
Seminar Presentation: UPGMA

The document summarizes the UPGMA (Unweighted Pair Group Method with Arithmetic Mean) method for phylogenetic tree construction. UPGMA is a distance-based method that generates rooted trees from a distance matrix. It works by sequentially clustering pairs of taxa based on the shortest distance between them. At each step, the two closest clusters are merged into a new cluster, and distances to the new cluster are recalculated using averages. The example shows the step-by-step process of applying UPGMA to a sample distance matrix.

Uploaded by

agron emiway

SEMINAR PRESENTATION

TOPIC:

UPGMA

Submitted To:
Dr. KOEL MUKHERJEE, Assistant Professor

Submitted By:
INDRANIL SARMAH (MCA/10011/19)
KUSHAGRA GUPTA (MCA/10026/19)
SHUBHAM PATIDAR (MCA/10037/19)
JAYDEEP KUMAR SILAWAT (MCA/10051/19)
SWAGAT SONEWANE (MCA/10024/19)

Phylogenetic tree construction
Two classes of methods:

• Distance-based methods

Examples: UPGMA, Neighbor joining, Fitch-Margoliash method, minimum evolution

• Character-based methods

Input: Aligned sequences

Output: Phylogenetic tree

Examples: Parsimony, Maximum Likelihood

UPGMA
UPGMA: Unweighted Pair Group Method with Arithmetic Mean
Developed by Sokal and Michener in 1958.
It is a sequential clustering method.
It is a distance-based method for phylogenetic tree construction.
UPGMA is the simplest method for constructing trees and uses a simple algorithm.
Generates rooted trees.
Generates ultrametric trees from a distance matrix.
Input: Distance matrix of pairwise distance estimates computed from aligned sequences

Output: Phylogenetic tree


UPGMA Algorithm
• UPGMA starts with a matrix of pairwise distances.

• Each sample is denoted as a 'cluster'.

• Assigns all clusters to a star-like tree.

• The algorithm constructs a rooted tree that reflects the structure present in a
pairwise similarity matrix.

• At each step, the nearest two clusters are combined into a higher-level
cluster.

• It assumes an ultrametric tree in which the distances from the root to every branch tip are equal.

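
Whether a distance matrix is compatible with an ultrametric tree can be tested with the three-point condition: in every triple of taxa, the two largest pairwise distances must be equal. The sketch below is only an illustration; the function name `is_ultrametric` and the tolerance `tol` are assumptions, not part of the slides.

```python
from itertools import combinations


def is_ultrametric(matrix, tol=1e-9):
    """Return True if every triple of taxa satisfies the three-point condition.

    matrix is a symmetric list of lists of pairwise distances.
    (Hypothetical helper, for illustration only.)
    """
    n = len(matrix)
    for a, b, c in combinations(range(n), 3):
        # Sort the three pairwise distances of the triple in ascending order.
        _, middle, largest = sorted((matrix[a][b], matrix[a][c], matrix[b][c]))
        # Ultrametric distances require the two largest values to coincide.
        if abs(largest - middle) > tol:
            return False
    return True


# A clock-like toy matrix: the first two taxa are close, the third is equidistant from both.
clocklike = [
    [0, 2, 6],
    [2, 0, 6],
    [6, 6, 0],
]
print(is_ultrametric(clocklike))
```

When the check fails, the tree built by average-based clustering is still ultrametric, so it cannot reproduce the input distances exactly.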
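Steps
1. Find the i and j with the smallest distance Dij.
2. Create a new group (ij) which has n(ij) = ni + nj members.
3. Connect i and j on the tree to a new node (ij).
4. Place the new node so that the depth of group (ij) is Dij/2; when i and j are single taxa, the edges connecting i to (ij) and j to (ij) have the same length.
5. Compute the distance between the new group and all other groups except i and j by using

   D_{(ij),k} = \frac{D_{ik} + D_{jk}}{2}

   Note: this simple average is the update used in the worked example. When groups i and j differ in size, the standard UPGMA update weights each distance by group size, D_{(ij),k} = \frac{n_i D_{ik} + n_j D_{jk}}{n_i + n_j}; the two formulas agree when ni = nj.
6. Delete the columns and rows corresponding to i and j and add one for (ij). If there are two or more groups left, go back to the first step.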
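
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `cluster_by_average`, the tuple-based group representation, and the `size_weighted` flag are all hypothetical. By default it uses the simple average from the formula on this slide; `size_weighted=True` switches to the size-weighted average that is usually given as the UPGMA update.

```python
from itertools import combinations


def cluster_by_average(labels, matrix, size_weighted=False):
    """Merge the closest groups repeatedly; return (group_i, group_j, depth) per merge."""
    # Each group is a tuple of leaf labels; dist is keyed by unordered pairs of groups.
    groups = [(name,) for name in labels]
    dist = {
        frozenset((groups[a], groups[b])): float(matrix[a][b])
        for a, b in combinations(range(len(labels)), 2)
    }
    merges = []
    while len(groups) > 1:
        # Step 1: closest pair of groups (ties go to the first pair encountered).
        i, j = min(combinations(groups, 2), key=lambda pair: dist[frozenset(pair)])
        d_ij = dist[frozenset((i, j))]
        # Steps 2-4: the new group has ni + nj members and sits at depth Dij / 2.
        new = i + j
        merges.append((i, j, d_ij / 2))
        # Step 5: distance from the new group to every remaining group k.
        for k in groups:
            if k in (i, j):
                continue
            d_ik = dist[frozenset((i, k))]
            d_jk = dist[frozenset((j, k))]
            if size_weighted:
                d_new = (len(i) * d_ik + len(j) * d_jk) / (len(i) + len(j))
            else:
                d_new = (d_ik + d_jk) / 2
            dist[frozenset((new, k))] = d_new
        # Step 6: drop i and j, add the merged group, and repeat.
        groups = [g for g in groups if g not in (i, j)] + [new]
    return merges


# Distance matrix from the Example slides.
LABELS = ["A", "B", "C", "D", "E", "F"]
MATRIX = [
    [0, 1, 3, 6, 7, 10],
    [1, 0, 3, 6, 7, 10],
    [3, 3, 0, 5, 6, 9],
    [6, 6, 5, 0, 1, 7],
    [7, 7, 6, 1, 0, 8],
    [10, 10, 9, 7, 8, 0],
]
for left, right, depth in cluster_by_average(LABELS, MATRIX):
    print(left, right, depth)
```

With the default simple average, the merge depths for the example matrix come out as 0.5, 0.5, 1.5, 3.0 and 4.25, which matches the node depths implied by the branch lengths in the example. With `size_weighted=True` the last two merges shift slightly, because the groups being averaged are no longer the same size.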
Computational tools
• MEGA
• PHYLIP
• MVSP
• MVSP87
• SAS
• SYN-TAX
• NTSYS
• DendroUPGMA
Advantages

• Simple algorithm
• Fast: among the fastest tree-building methods
• Easy to compute by hand or with a variety of software
• Trees represent phenotypic similarities as distances
• Data can be arranged in random order prior to analysis
• Rooted trees are generated that are easy to analyze
Disadvantages

• It assumes the same evolutionary rate on all lineages
• It frequently produces incorrect tree topologies when that assumption is violated
• Re-rooting is not allowed
• The algorithm does not aim to reflect evolutionary descent
• It relies on the molecular clock hypothesis
Applications
• In ecology, it is one of the most popular methods for the classification of sampling units (such as vegetation plots) on the basis of their pairwise similarities in relevant descriptor variables (such as species composition).[3]
• In bioinformatics, UPGMA is used for the creation of phenetic trees (phenograms). UPGMA was initially designed for use in protein electrophoresis studies, but is currently most often used to produce guide trees for more sophisticated algorithms. For example, it is used in sequence alignment procedures, where it proposes an order in which the sequences will be aligned. The guide tree aims to group the most similar sequences, regardless of their evolutionary rate or phylogenetic affinities, and that is exactly the goal of UPGMA.[4]
• In phylogenetics, UPGMA assumes a constant rate of evolution (molecular clock hypothesis), and is not a well-regarded method for
Example
1. Calculate the pairwise distance matrix

      A    B    C    D    E    F
A     0    1    3    6    7   10
B     1    0    3    6    7   10
C     3    3    0    5    6    9
D     6    6    5    0    1    7
E     7    7    6    1    0    8
F    10   10    9    7    8    0
2. Group the 2 most closely related sequences

      A    B    C    D    E    F
A     0    1    3    6    7   10
B     1    0    3    6    7   10
C     3    3    0    5    6    9
D     6    6    5    0    1    7
E     7    7    6    1    0    8
F    10   10    9    7    8    0

Tree so far: A and B joined (branch length 0.5 each).
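
The choice of pair in this step can be checked by scanning the matrix for its smallest off-diagonal entry. The short sketch below does that; the helper name `closest_pairs` is an assumption made for illustration.

```python
from itertools import combinations

LABELS = ["A", "B", "C", "D", "E", "F"]
MATRIX = [
    [0, 1, 3, 6, 7, 10],
    [1, 0, 3, 6, 7, 10],
    [3, 3, 0, 5, 6, 9],
    [6, 6, 5, 0, 1, 7],
    [7, 7, 6, 1, 0, 8],
    [10, 10, 9, 7, 8, 0],
]


def closest_pairs(labels, matrix):
    """Return the smallest off-diagonal distance and every pair that attains it."""
    pairs = list(combinations(range(len(labels)), 2))
    smallest = min(matrix[a][b] for a, b in pairs)
    ties = [(labels[a], labels[b]) for a, b in pairs if matrix[a][b] == smallest]
    return smallest, ties


print(closest_pairs(LABELS, MATRIX))  # (1, [('A', 'B'), ('D', 'E')])
```

The minimum distance of 1 is shared by the pairs A/B and D/E. The example groups A and B first and D and E next; because the two pairs do not overlap, the order of these two merges does not change the result.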
3. Recalculate the distance matrix and take the next smallest distance

      A/B   C    D    E    F
A/B    0    3    6    7   10
C      3    0    5    6    9
D      6    5    0    1    7
E      7    6    1    0    8
F     10    9    7    8    0

Tree so far: A and B joined (branch length 0.5 each); D and E joined (branch length 0.5 each).
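
As a check, the A/B row of this matrix follows from the averaging formula on the Steps slide, applied to the original distances of A and B:

```latex
\begin{aligned}
D_{(AB),C} &= \frac{D_{AC} + D_{BC}}{2} = \frac{3 + 3}{2} = 3 \\
D_{(AB),D} &= \frac{D_{AD} + D_{BD}}{2} = \frac{6 + 6}{2} = 6 \\
D_{(AB),E} &= \frac{D_{AE} + D_{BE}}{2} = \frac{7 + 7}{2} = 7 \\
D_{(AB),F} &= \frac{D_{AF} + D_{BF}}{2} = \frac{10 + 10}{2} = 10
\end{aligned}
```

The smallest remaining distance is D_{DE} = 1, so D and E are joined at depth 1/2 = 0.5.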
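3. Recalculate the distance matrix and take the next smallest distance

      A/B    C    D/E    F
A/B    0     3    6.5   10
C      3     0    5.5    9
D/E   6.5   5.5    0    7.5
F     10     9    7.5    0

Tree so far: A and B joined (0.5 each); C joined to the A/B node (C branch 1.5, branch from the A/B node 1); D and E joined (0.5 each).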
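
The D/E entries here come from averaging the D and E rows of the previous matrix, and the next merge follows from the smallest entry:

```latex
\begin{aligned}
D_{(DE),(AB)} &= \frac{D_{D,(AB)} + D_{E,(AB)}}{2} = \frac{6 + 7}{2} = 6.5 \\
D_{(DE),C} &= \frac{D_{DC} + D_{EC}}{2} = \frac{5 + 6}{2} = 5.5 \\
D_{(DE),F} &= \frac{D_{DF} + D_{EF}}{2} = \frac{7 + 8}{2} = 7.5 \\
\text{depth of } (ABC) &= \frac{D_{(AB),C}}{2} = \frac{3}{2} = 1.5, \qquad
\text{branch from } (AB) = 1.5 - 0.5 = 1
\end{aligned}
```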
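3. Recalculate the distance matrix and take the next smallest distance

        A/B/C   D/E    F
A/B/C     0      6    9.5
D/E       6      0    7.5
F        9.5    7.5    0

Tree so far: A and B joined (0.5 each); C joined to the A/B node (C branch 1.5, branch from the A/B node 1); D and E joined (0.5 each); the A/B/C node joined to the D/E node (branch from the A/B/C node 1.5, branch from the D/E node 2.5).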
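
The A/B/C entries and the new branch lengths can be reproduced from the previous matrix with the simple-average formula from the Steps slide:

```latex
\begin{aligned}
D_{(ABC),(DE)} &= \frac{D_{(AB),(DE)} + D_{C,(DE)}}{2} = \frac{6.5 + 5.5}{2} = 6 \\
D_{(ABC),F} &= \frac{D_{(AB),F} + D_{C,F}}{2} = \frac{10 + 9}{2} = 9.5 \\
\text{depth of } (ABCDE) &= \frac{D_{(ABC),(DE)}}{2} = \frac{6}{2} = 3 \\
\text{branch from } (ABC) &= 3 - 1.5 = 1.5, \qquad
\text{branch from } (DE) = 3 - 0.5 = 2.5
\end{aligned}
```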
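3. Recalculate the distance matrix and take the next smallest distance

            A/B/C/D/E    F
A/B/C/D/E       0       8.5
F              8.5       0

Final tree: A and B joined (0.5 each); C joined to the A/B node (C branch 1.5, branch from the A/B node 1); D and E joined (0.5 each); the A/B/C node joined to the D/E node (branch from the A/B/C node 1.5, branch from the D/E node 2.5); F joined to the A/B/C/D/E node at the root (F branch 4.25, branch from the A/B/C/D/E node 1.25).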
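
The last distance is (9.5 + 7.5)/2 = 8.5, which puts the root at depth 8.5/2 = 4.25. Because the tree is ultrametric, every tip should lie exactly 4.25 from the root. The sketch below adds up the branch lengths from the tree figure along each root-to-tip path; the `PATHS` dictionary is a hand transcription of those lengths and is an assumption of this illustration.

```python
# Branch lengths along each path, listed from the tip up to the root.
PATHS = {
    "A": [0.5, 1.0, 1.5, 1.25],
    "B": [0.5, 1.0, 1.5, 1.25],
    "C": [1.5, 1.5, 1.25],
    "D": [0.5, 2.5, 1.25],
    "E": [0.5, 2.5, 1.25],
    "F": [4.25],
}

# Total root-to-tip distance for every taxon.
root_to_tip = {tip: sum(lengths) for tip, lengths in PATHS.items()}

for tip, total in root_to_tip.items():
    print(tip, total)
```

All six totals are equal, as the equal root-to-tip assumption requires.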
