Lecture 6
December 3, 2012
Introduction Basic Definitions Dictionaries Suffix tree Example Overview
1 Introduction
Introduction
2 Basic Definitions
Graph theory
Alphabet and strings
3 Dictionaries
Trie
Patricia tree
4 Suffix tree
Suffix trie
Suffix tree
Ukkonen’s algorithm
5 Example
6 Overview
Introduction
1 String matching
2 Indexing and querying
Knuth-Morris-Pratt algorithm
Boyer-Moore, Boyer-Moore-Horspool, Turbo-Boyer-Moore, etc.
Aho-Corasick
...
Introduction
Indexing - 1
Problem
Given a text T , we need to construct an efficient data structure D
which will serve as an index of T , so that we can efficiently query
text T .
Graph theory
Graph
A graph is a pair G = (V, E) of sets such that E ⊆ V × V.
Path
A path of length n in a graph G = (V, E) is a sequence v_0, v_1, ..., v_n ∈ V such that (v_0, v_1), (v_1, v_2), ..., (v_{n−1}, v_n) ∈ E.
Cycle
A path v_0, v_1, ..., v_n, v_0, where n ≥ 2, is called a cycle.
[Figure: example graph on the vertices 1-6]
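The definitions of path and cycle can be checked mechanically. The following is a minimal sketch, assuming a graph stored as a set of ordered vertex pairs; the edge set E and the function names are hypothetical and are not taken from the slides.

```python
# Hypothetical edge set on the vertices 1..6 (illustrative data only).
E = {(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 1)}

def is_path(vertices, edges):
    """True if every pair of consecutive vertices is an edge.

    The length of the path is len(vertices) - 1.
    """
    return all((u, v) in edges for u, v in zip(vertices, vertices[1:]))

def is_cycle(vertices, edges):
    """True for a path v_0, ..., v_n, v_0 with n >= 2 (at least 4 entries)."""
    return (len(vertices) >= 4
            and vertices[0] == vertices[-1]
            and is_path(vertices, edges))
```

For example, `is_path([1, 2, 3, 4], E)` holds, while `[1, 3]` is not a path because (1, 3) is not an edge.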
Graph theory
Tree
A rooted tree is a connected acyclic graph T = (V, E) with a special vertex
v ∈ V called the root. Nodes with degree 1 are called leaves.
Definition (Alphabet)
An alphabet Σ is a finite non-empty set whose elements are called
letters.
Definition (String)
A string on an alphabet Σ is a finite, possibly empty, sequence of
elements of Σ.
The zero-letter sequence is called the empty string, and is denoted
by ε.
The set of all possible strings on the alphabet Σ is denoted by Σ*.
Definition (Length of string)
The length of a string x is defined as the length of the sequence
associated with the string x, and is denoted by |x|.
Trie
The name comes from "retrieval".
Construct a dictionary for the set of words
{amy, andy, ann, rob, roger, ben, betty}
[Figure: trie for the word set; every word is closed by the terminator $]
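The dictionary above can be sketched in a few lines. This is a minimal illustration, assuming nested Python dicts as nodes and $ as the end-of-word marker; the function names are hypothetical.

```python
END = "$"  # terminator that marks the end of a stored word

def trie_insert(root, word):
    """Walk down from the root, creating missing nodes, one character per edge."""
    node = root
    for ch in word + END:
        node = node.setdefault(ch, {})

def trie_contains(root, word):
    """A word is stored only if its path, including the terminator, exists."""
    node = root
    for ch in word + END:
        if ch not in node:
            return False
        node = node[ch]
    return True

def build_trie(words):
    root = {}
    for w in words:
        trie_insert(root, w)
    return root

trie = build_trie(["amy", "andy", "ann", "rob", "roger", "ben", "betty"])
```

A lookup costs time proportional to the length of the word, independent of how many words are stored. The terminator is what distinguishes the stored word ann from the mere prefix an.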
Patricia tree
1 Construct a trie
2 Remove nodes with out-degree 1 and concatenate the labels
of the corresponding edges into one edge
[Figure: the resulting Patricia tree, with edge labels a, my, n, dy, ro, b, ger, be, tty]
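The two construction steps can be sketched as follows. This is an illustration only: it assumes nested dicts as nodes and, like the figure, uses no terminator, which is safe here because no word in the set is a prefix of another. The function names are hypothetical.

```python
def build_trie(words):
    """Step 1: plain trie, one character per edge."""
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
    return root

def compress(node):
    """Step 2: merge every chain of out-degree-1 nodes into a single edge."""
    out = {}
    for label, child in node.items():
        # Absorb the child's only edge into the current label, repeatedly.
        while len(child) == 1:
            (next_label, next_child), = child.items()
            label += next_label
            child = next_child
        out[label] = compress(child)
    return out

patricia = compress(build_trie(
    ["amy", "andy", "ann", "rob", "roger", "ben", "betty"]))
```

The result has the edge labels a, my, n, dy, ro, b, ger, be and tty. If some word were a prefix of another, a terminator would be needed to keep it from disappearing inside a merged edge.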
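Suffix trie
Example
Given t = banana$, the set Suff(t) consists of the suffixes banana$, anana$, nana$, ana$, na$ and a$ (and $ itself, if the empty suffix is taken into account).
[Figure: suffix trie of banana$; the leaves 1-6 give the starting positions of the suffixes]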
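A suffix trie is just a trie over Suff(t). The sketch below assumes nested dicts as nodes and counts the suffix $ as well; the function names are hypothetical. It also shows the basic query a suffix trie supports: w is a substring of t exactly when w labels a path starting at the root.

```python
def suffixes(t):
    """All non-empty suffixes of t, longest first."""
    return [t[j:] for j in range(len(t))]

def build_suffix_trie(t):
    """Insert every suffix of t into a trie (quadratic time and space)."""
    root = {}
    for s in suffixes(t):
        node = root
        for ch in s:
            node = node.setdefault(ch, {})
    return root

def is_substring(trie, w):
    """Every substring of t is a prefix of some suffix of t."""
    node = trie
    for ch in w:
        if ch not in node:
            return False
        node = node[ch]
    return True

trie = build_suffix_trie("banana$")
```

The quadratic size of this structure is the reason for compressing it into a suffix tree.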
Suffix tree
Definition
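A suffix tree is the Patricia tree of the suffix trie.
Construction
1 Construct the suffix trie of the text x
2 Compress it into a Patricia tree: remove nodes with out-degree 1 and concatenate the labels of the corresponding edges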
[Figure: suffix trie of banana$ and the suffix tree obtained from it, with edge labels a, na, na$, banana$ and $]
Suffix tree
Theorem
A suffix tree of a text of length n consists of at most 2n − 1 nodes (or 2n if the empty suffix
$ is taken into account).
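The node bound can be checked on the running example. The sketch below builds the suffix tree of banana$ the slow way (suffix trie first, then compression) and counts nodes and leaves. It assumes nested dicts as nodes and counts the suffix $ as a leaf, so the text has 7 characters and 7 leaves; the function names are hypothetical.

```python
def suffix_trie(text):
    root = {}
    for j in range(len(text)):
        node = root
        for ch in text[j:]:
            node = node.setdefault(ch, {})
    return root

def compress(node):
    """Merge chains of out-degree-1 nodes into single edges."""
    out = {}
    for label, child in node.items():
        while len(child) == 1:
            (nxt, grandchild), = child.items()
            label += nxt
            child = grandchild
        out[label] = compress(child)
    return out

def count_nodes(node):
    return 1 + sum(count_nodes(c) for c in node.values())

def count_leaves(node):
    if not node:
        return 1
    return sum(count_leaves(c) for c in node.values())

tree = compress(suffix_trie("banana$"))
```

Here the tree has 7 leaves and 11 nodes in total, which stays below 2 · 7 − 1 = 13. The bound follows because every internal node of the compressed tree has at least two children.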
Ukkonen’s algorithm
Definition
An implicit suffix tree for string x is a tree obtained from the suffix
tree of x by
1 Removing $ from all edge labels
2 Removing any edge that has no label
3 Removing any node with only one child
[Figure: suffix tree of banana$ and the implicit suffix tree of banana derived from it, which keeps only the leaves 1, 2 and 3]
Ukkonen’s algorithm
1 Start with T = I_1.
2 Consecutively update T to I_2, I_3, ..., I_{n+1} in n phases, where
I_i represents the implicit suffix tree of the prefix y[1..i].
Phase i + 1 updates T from I_i (with all suffixes of y[1..i]) to
I_{i+1} (with all suffixes of y[1..i + 1]).
Each phase i + 1 consists of extensions j = 1, 2, ..., i + 1 (one for
each suffix of y[1..i + 1]).
Extension j ensures that the suffix y[j..i + 1] is in I_{i+1}.
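The loop structure of phases and extensions can be sketched as follows. This is not Ukkonen's algorithm itself: as a loudly stated simplification, an uncompressed trie stands in for the implicit suffix tree, so the extension rules collapse into "create the missing nodes". The function name and the trace format are hypothetical.

```python
def build_by_phases(y):
    """Mirror the phase/extension loops on an uncompressed trie.

    Indices are 0-based here: phase i + 1 adds the character y[i],
    and extension j + 1 makes sure the substring y[j..i] is in the trie.
    """
    root = {}
    trace = []  # (phase, extension, string ensured by this extension)
    for i in range(len(y)):
        for j in range(i + 1):
            node = root
            # Walking from the root character by character is the costly
            # part that suffix links and skip/count later avoid.
            for ch in y[j:i + 1]:
                node = node.setdefault(ch, {})
            trace.append((i + 1, j + 1, y[j:i + 1]))
    return root, trace
```

For y = abc this performs 1 + 2 + 3 = 6 extensions, the last one being extension 3 of phase 3, which ensures the suffix c.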
Ukkonen’s algorithm
Complexity
The approach presented so far runs in O(n^3).
Proof
Consider a single phase i + 1.
Each extension rule can be applied in O(1) ⇒ applying all i + 1
extensions takes time Θ(i).
Locating the ends of the string paths y[1..i], ..., y[i] by traversing
the edge labels takes time Σ_{k=1}^{i} k = Θ(i^2).
⇒ Therefore, the total time for all phases i = 1, 2, ..., n is
Σ_{i=1}^{n} i^2 = Θ(n^3),
which is even worse than the naive algorithm, which runs in O(n^2).
We will see how this approach, with the use of some simple tricks,
can achieve linear run-time.
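For reference, the two sums in the proof have well-known closed forms; the derivation below only spells them out and adds nothing beyond the argument above.

```latex
\[
\sum_{k=1}^{i} k = \frac{i(i+1)}{2} = \Theta(i^2),
\qquad
\sum_{i=1}^{n} i^2 = \frac{n(n+1)(2n+1)}{6} = \Theta(n^3).
\]
```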
Ukkonen’s algorithm
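Suffix links
A suffix link leads from an internal node v with path label xα, where x is a single character and α a (possibly empty) string, to the node s(v) with path label α.
[Figure: example suffix tree with suffix links]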
Ukkonen’s algorithm
Observation
If an internal node v is created during extension j (of phase i + 1),
then extension j + 1 will find the node s(v).
Let v be labeled xα.
Node v can only be created by extension Rule 2.
That is, v is inserted at the end of the path y[j..i], which was continued
by some character c ≠ y[i + 1].
⇒ Therefore, the paths xαc and αc were entered before phase
i + 1.
⇒ In extension j + 1, node s(v) is either found or created at the
end of the path α = y[j + 1..i].
Ukkonen’s algorithm
[Figure: node v with path label xα and its suffix link to the node s(v) with path label α]
Ukkonen’s algorithm
Skip/Count trick
In phase i + 1, each path y[j..i] that is followed in extension j
is known to exist in the tree.
⇒ The path can be followed by choosing the correct edges, instead
of examining every character.
Let y[k] be the next character to be matched on the path y[j..i].
Now an edge labeled by y[p..q] can be traversed simply by
checking that y[p] = y[k] and skipping the next q − p characters
of y[j..i].
⇒ The time to traverse a path is proportional to the number of
nodes on the path (instead of its string length).
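The trick can be sketched as a walk that looks only at the first character of each edge. This is a minimal illustration under stated assumptions: 0-based inclusive indices, a hypothetical Node class whose edges are keyed by their first character, and a small hand-built fragment of a tree for y = abcabx.

```python
class Node:
    def __init__(self):
        # first character -> (p, q, child); the edge label is y[p..q]
        self.edges = {}

def skip_count(y, node, j, i):
    """Follow the path y[j..i], which is known to exist below `node`.

    Returns (node, edge, offset): the path ends `offset` characters into
    `edge` leaving `node`, or exactly at `node` when edge is None.
    """
    k = j  # y[k] is the next character to be matched
    while k <= i:
        p, q, child = node.edges[y[k]]   # pick the edge by its first character
        length = q - p + 1
        if length > i - k + 1:           # the path ends inside this edge
            return node, (p, q, child), i - k + 1
        k += length                      # skip the rest of the label unread
        node = child
    return node, None, 0

# Hand-built fragment: root --ab--> ab, ab --cabx--> leaf, ab --x--> leaf.
y = "abcabx"
root, ab, leaf1, leaf4 = Node(), Node(), Node(), Node()
root.edges["a"] = (0, 1, ab)
ab.edges["c"] = (2, 5, leaf1)
ab.edges["x"] = (5, 5, leaf4)
```

Following y[0..3] = abca takes two edge lookups and ends two characters into the edge cabx, while following y[3..4] = ab ends exactly at the node ab.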
Ukkonen’s algorithm
Lemma
For any node v with a suffix link to s(v), it holds that
depth(v) ≤ depth(s(v)) + 1, where depth denotes the node depth.
Sketch of proof
The suffix links for any ancestor of v lead to distinct ancestors of
s(v).
Ukkonen’s algorithm
Theorem
Using suffix links and the skip/count trick, a single phase i takes
time O(n).
Proof
There are i + 1 ≤ n + 1 extensions in phase i + 1.
In any extension, all work other than tree traversal (that is,
applying the extension rules) takes only O(1) time.
How can the work for traversing the tree be bounded?
To find the end of the next path, an extension first moves at most
one level up. Then a suffix link may be followed, after which
a down-traversal matches the rest of the path.
Ukkonen’s algorithm
Improvement
Since there are n phases, the total run-time is O(n^2).
Ukkonen’s algorithm
Eliminating extensions
⇒ j_{i+1} ≥ j_i
Ukkonen’s algorithm
Theorem
Ukkonen’s algorithm builds the suffix tree for y[1..n] in time
O(n) when implemented using the tricks mentioned above.
Proof
Ukkonen’s algorithm
Finally, the implicit suffix tree I_{n+1} can be converted to the true
suffix tree of y[1..n]$ in the following way
Example
Positions: 1 2 3 4 5 6 7 8 9 10
y = a b c a b x a b c $
Edge labels are written as index pairs (p, q) into y, where e stands for the current end of the text.
Phase 1 (character a): the explicit extension applies Rule 2 and creates leaf 1 with edge (1, e).
Phase 2 (character b): one implicit extension (leaf 1); the explicit extension creates leaf 2 with edge (2, e).
Phase 3 (character c): two implicit extensions (leaves 1 and 2); the explicit extension creates leaf 3 with edge (3, e).
Phase 4 (character a): three implicit extensions; the explicit extension ends with Rule 3, because a is already in the tree. The tree is unchanged.
Phase 5 (character b): three implicit extensions; the explicit extension ends with Rule 3, because ab is already in the tree. The tree is unchanged.
Phase 6 (character x): all implicit extensions are skipped. Three explicit extensions apply Rule 2:
1 abx: the edge (1, e) is split into (1, 2) followed by (3, e), and leaf 4 is added with edge (6, e)
2 bx: the edge (2, e) is split into (2, 2) followed by (3, e), and leaf 5 is added with edge (6, e); a suffix link is created from the node ab to the node b
3 x: leaf 6 is added below the root with edge (6, e)
Phase 7 (character a): all implicit extensions are skipped; the explicit extension ends with Rule 3. The tree is unchanged.
Phase 8 (character b): all implicit extensions are skipped; the explicit extension ends with Rule 3. The tree is unchanged.
Phase 9 (character c): all implicit extensions are skipped; the explicit extension ends with Rule 3. The tree is unchanged.
Phase 10 (character $): all implicit extensions are skipped. Four explicit extensions apply Rule 2:
1 abc$: the edge (3, e) below the node ab is split into (3, 3) followed by (4, e), and leaf 7 is added with edge (10, e); then the suffix link is followed
2 bc$: the edge (3, e) below the node b is split into (3, 3) followed by (4, e), and leaf 8 is added with edge (10, e); then the suffix link is followed
3 c$: the edge (3, e) below the root is split into (3, 3) followed by (4, e), and leaf 9 is added with edge (10, e); suffix links are created for the new internal nodes
4 $: leaf 10 is added below the root with edge (10, e)
Final suffix tree of y = abcabxabc$ (leaf numbers are the starting positions of the suffixes):
root
  ab
    c
      abxabc$   leaf 1
      $         leaf 7
    xabc$       leaf 4
  b
    c
      abxabc$   leaf 2
      $         leaf 8
    xabc$       leaf 5
  c
    abxabc$     leaf 3
    $           leaf 9
  xabc$         leaf 6
  $             leaf 10
[Figure: the final suffix tree of y = abcabxabc$, with three positions of y marked by ↓]
Leaves indicate the starting positions of a
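How the leaves answer a query can be sketched as follows: walk down along the pattern, then report every leaf below the point reached. As a loudly stated simplification, this uses an uncompressed suffix trie built naively in quadratic time rather than the suffix tree produced by Ukkonen's algorithm; the class and function names are hypothetical.

```python
class Node:
    def __init__(self):
        self.children = {}  # character -> Node
        self.start = None   # 1-based start position if a suffix ends here

def build_suffix_trie(text):
    root = Node()
    for j in range(len(text)):
        node = root
        for ch in text[j:]:
            node = node.children.setdefault(ch, Node())
        node.start = j + 1
    return root

def occurrences(root, pattern):
    """Starting positions of all occurrences of `pattern` in the text."""
    node = root
    for ch in pattern:
        if ch not in node.children:
            return []
        node = node.children[ch]
    # Every leaf below `node` is a suffix that starts with the pattern.
    found, stack = [], [node]
    while stack:
        n = stack.pop()
        if n.start is not None:
            found.append(n.start)
        stack.extend(n.children.values())
    return sorted(found)

index = build_suffix_trie("abcabxabc$")
```

For example, `occurrences(index, "ab")` yields the positions 1, 4 and 7, which are the three leaves below the node ab in the final tree.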
Overview
We’ve seen what suffix trees are and some of their properties.
Patricia suffix tries for a string x[1..n]
At most 2n − 1 nodes
Exactly n leaves