6.851 Advanced Data Structures (Spring'12) Prof. Erik Demaine Problem 9 Sample Solution

This document provides a sample solution for string matching using a compressed suffix trie constructed from the concatenation of two strings S1 and S2 separated by delimiter symbols. It explains that the trie can be constructed in O(|S1| + |S2| + sort(Σ)) time and trimmed to focus on S1 and S2 in additional O(|S1| + |S2|) time. It then describes how to traverse the trie to count the number of substrings of S1 of length at least k that occur in S2 in O(|S1| + |S2|) time by accounting for substrings represented by compressed trie edges.

Uploaded by

djoseph_1
Copyright
© Attribution Non-Commercial (BY-NC)
String matching. Construct a compressed suffix trie over S = S1 $1 S2 $2, then trim below each $1 so that every leaf ends in either $1 or $2. The construction takes O(|S1| + |S2| + sort(Σ)) time, and the trimming can be done with a traversal of the trie in O(|S1| + |S2|) time.

Traverse the trie and record, for each node, whether it has a $2 in its subtree as well as the number of $1s in its subtree. A node corresponds to a substring that appears in S2 if and only if it has a $2 in its subtree, and that substring occurs in S1 as many times as there are $1s in its subtree.

To compute the total number of substrings of S1 of length at least k that occur in S2, traverse the trie and sum the number of $1 leaves over every node that has a $2 leaf in its subtree and has letter depth at least k. This procedure would be correct on an uncompressed suffix trie, but it does not account for the substrings corresponding to nodes compressed into edges. To correct for this, when adding the number of $1 leaves for a node, multiply it by the length of the edge to its parent or, if the parent has letter depth less than k, by the node's letter depth minus (k − 1), which counts exactly the positions on that edge with letter depth at least k.
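The counting step can be sketched in Python as follows. This is an illustration under stated assumptions, not the course's reference implementation. It builds the compressed trie by naive suffix insertion, which takes quadratic time rather than the linear-time construction the solution relies on. It also inserts the suffixes of S1 $1 and S2 $2 directly, which should give the same trie as building over S and trimming below $1. The names `Node`, `insert`, `count_matches` and `brute_force` are hypothetical, the characters `\x01` and `\x02` stand in for $1 and $2 and are assumed not to occur in the inputs, and k is assumed to be at least 1.

```python
class Node:
    def __init__(self):
        # Maps the first character of an edge to (edge_label, child).
        self.children = {}


def insert(root, s):
    """Insert string s into the compressed trie (naive, not linear time)."""
    node, i = root, 0
    while i < len(s):
        c = s[i]
        if c not in node.children:
            node.children[c] = (s[i:], Node())  # new leaf edge
            return
        label, child = node.children[c]
        j = 0  # length of the common prefix of label and s[i:]
        while j < len(label) and i + j < len(s) and label[j] == s[i + j]:
            j += 1
        if j < len(label):
            # Split the edge at the point of divergence.
            mid = Node()
            mid.children[label[j]] = (label[j:], child)
            node.children[c] = (label[:j], mid)
            child = mid
        node, i = child, i + j


def count_matches(s1, s2, k):
    """Count occurrences of substrings of s1 with length >= k that occur in s2."""
    d1, d2 = "\x01", "\x02"  # stand-ins for $1 and $2 (assumed absent from inputs)
    root = Node()
    for i in range(len(s1)):
        insert(root, s1[i:] + d1)
    for i in range(len(s2)):
        insert(root, s2[i:] + d2)

    total = 0

    def visit(node, depth, label):
        # Returns (number of $1 leaves in subtree, whether subtree has a $2 leaf).
        nonlocal total
        if not node.children:
            # Leaves add nothing themselves: a $1 leaf has no $2 below it,
            # and a $2 leaf has zero $1 leaves below it.
            return (1, False) if label.endswith(d1) else (0, True)
        ones, has2 = 0, False
        for child_label, child in node.children.values():
            o, h = visit(child, depth + len(child_label), child_label)
            ones += o
            has2 = has2 or h
        if node is not root and has2 and depth >= k:
            parent_depth = depth - len(label)
            # Count only the positions on the edge with letter depth >= k.
            weight = len(label) if parent_depth >= k else depth - (k - 1)
            total += ones * weight
        return ones, has2

    visit(root, 0, "")
    return total


def brute_force(s1, s2, k):
    """Reference count by direct enumeration, for checking small inputs."""
    return sum(
        1
        for i in range(len(s1))
        for j in range(i + k, len(s1) + 1)
        if s1[i:j] in s2
    )
```

For S1 = "abab", S2 = "bab" and k = 2, both functions count four occurrences: "ab" twice, "ba" once and "bab" once. The recursive traversal is only suitable for short strings; an iterative traversal would be needed for long inputs.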

MIT OpenCourseWare http://ocw.mit.edu

6.851 Advanced Data Structures
Spring 2012

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.