ICS 161: Design and Analysis of Algorithms

Lecture notes for February 29, 1996
Longest Common Subsequences
In this lecture we examine another string matching problem, of finding the longest common
subsequence of two strings.
This is a good example of the technique of dynamic programming, which is the following very simple
idea: start with a recursive algorithm for the problem, which may be inefficient because it calls itself
repeatedly on a small number of subproblems. Simply remember the solution to each subproblem the
first time you compute it, then after that look it up instead of recomputing it. The overall time bound
then becomes (typically) proportional to the number of distinct subproblems rather than the larger
number of recursive calls. We already saw this idea briefly in the first lecture.
As we'll see, there are two ways of doing dynamic programming, top-down and bottom-up. The
top-down (memoizing) method is closer to the original recursive algorithm, so easier to understand, but
the bottom-up method is usually a little more efficient.

Subsequence testing
Before we define the longest common subsequence problem, let's start with an easy warmup. Suppose
you're given a short string (pattern) and long string (text), as in the string matching problem. But now
you want to know if the letters of the pattern appear in order (but possibly separated) in the text. If
they do, we say that the pattern is a subsequence of the text.
As an example, is "nano" a subsequence of "nematode knowledge"? Yes, and in only one way. The
easiest way to see this example is to capitalize the subsequence: "NemAtode kNOwledge".
In general, we can test this as before using a finite state machine. Draw circles and arrows as before,
corresponding to partial subsequences (prefixes of the pattern), but now there is no need for
backtracking.

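Equivalently, it is easy to write code or pseudocode for this:
    /* Returns 1 if pattern P is a subsequence of text T, 0 otherwise. */
    int subseq(const char *P, const char *T)
    {
        if (*P == '\0') return 1;    /* the empty pattern always matches */
        while (*T != '\0')
            if (*P == *T++ && *++P == '\0')
                return 1;
        return 0;
    }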

Longest common subsequence problem

What if the pattern does not occur in the text? It still makes sense to find the longest subsequence that
occurs both in the pattern and in the text. This is the longest common subsequence problem. Since the
pattern and text have symmetric roles, from now on we won't give them different names but just call
them strings A and B. We'll use m to denote the length of A and n to denote the length of B.
Note that the automata-theoretic method above doesn't solve the problem; instead it gives the longest
prefix of A that's a subsequence of B. But the longest common subsequence of A and B is not always
a prefix of A.
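To make the gap between the two notions concrete, here is a minimal sketch in C. The names prefix_match_len and lcs_len are illustrative and do not come from the lecture; lcs_len is a brute-force recursion, included only as a reference for tiny inputs.

```c
#include <stddef.h>

/* Length of the longest prefix of A that is a subsequence of B:
   this is what a single left-to-right scan of B computes. */
size_t prefix_match_len(const char *A, const char *B)
{
    size_t matched = 0;
    for (; *B != '\0' && A[matched] != '\0'; B++)
        if (A[matched] == *B)
            matched++;
    return matched;
}

/* Reference LCS length by brute-force recursion (exponential time). */
size_t lcs_len(const char *A, const char *B)
{
    size_t skip_a, skip_b;
    if (*A == '\0' || *B == '\0') return 0;
    if (*A == *B) return 1 + lcs_len(A + 1, B + 1);
    skip_a = lcs_len(A + 1, B);
    skip_b = lcs_len(A, B + 1);
    return skip_a > skip_b ? skip_a : skip_b;
}
```

With A = "xab" and B = "ab", the scan gets stuck on the leading "x" and reports a prefix of length 0, while the longest common subsequence "ab" has length 2.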
Why might we want to solve the longest common subsequence problem? There are several motivating
applications.
Molecular biology. DNA sequences (genes) can be represented as sequences of four letters
ACGT, corresponding to the four submolecules forming DNA. When biologists find a new
sequence, they typically want to know what other sequences it is most similar to. One way of
computing how similar two sequences are is to find the length of their longest common
subsequence.
File comparison. The Unix program "diff" is used to compare two different versions of the same
file, to determine what changes have been made to the file. It works by finding a longest
common subsequence of the lines of the two files; any line in the subsequence has not been
changed, so what it displays is the remaining set of lines that have changed. In this instance of
the problem we should think of each line of a file as being a single complicated character in a
string.
Screen redisplay. Many text editors like "emacs" display part of a file on the screen, updating
the screen image as the file is changed. For slow dial-in terminals, these programs want to send
the terminal as few characters as possible to cause it to update its display correctly. It is possible
to view the computation of the minimum length sequence of characters needed to update the
terminal as being a sort of common subsequence problem (the common subsequence tells you
the parts of the display that are already correct and don't need to be changed).
(As an aside, it is natural to define a similar longest common substring problem, asking for the longest
substring that appears in two input strings. This problem can be solved in linear time using a data
structure known as the suffix tree but the solution is extremely complicated.)

Recursive solution
So we want to solve the longest common subsequence problem by dynamic programming. To do this,
we first need a recursive solution. The dynamic programming idea doesn't tell us how to find this, it
just gives us a way of making the solution more efficient once we have one.
Let's start with some simple observations about the LCS problem. If we have two strings, say
"nematode knowledge" and "empty bottle", we can represent a subsequence as a way of writing the
two so that certain letters line up (here the fourth line joins the two space characters):
    nematode know ledge
     || |   |  |  ||
     empty   b ottle

If we draw lines connecting the letters in the first string to the corresponding letters in the second, no
two lines cross (the top and bottom endpoints occur in the same order, the order of the letters in the
subsequence). Conversely any set of lines drawn like this, without crossings, represents a
subsequence.

From this we can observe the following simple fact: if the two strings start with the same letter, it's
always safe to choose that starting letter as the first character of the subsequence. This is because, if
you have some other subsequence, represented as a collection of lines as drawn above, you can "push"
the leftmost line to the start of the two strings, without causing any other crossings, and get a
representation of an equally long subsequence that does start this way.
On the other hand, suppose that, like the example above, the two first characters differ. Then it is not
possible for both of them to be part of a common subsequence; one or the other (or maybe both) will
have to be removed.
Finally, observe that once we've decided what to do with the first characters of the strings, the
remaining subproblem is again a longest common subsequence problem, on two shorter strings.
Therefore we can solve it recursively.
Rather than finding the subsequence itself, it turns out to be more efficient to find the length of the
longest subsequence. Then in the case where the first characters differ, we can determine which
subproblem gives the correct solution by solving both and taking the max of the resulting subsequence
lengths. Once we turn this into a dynamic program we will see how to get the sequence itself.
These observations give us the following, very inefficient, recursive algorithm.
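Recursive LCS:
    static int max(int a, int b) { return a > b ? a : b; }

    /* Length of the longest common subsequence of A and B. */
    int lcs_length(const char *A, const char *B)
    {
        if (*A == '\0' || *B == '\0') return 0;
        else if (*A == *B) return 1 + lcs_length(A+1, B+1);
        else return max(lcs_length(A+1,B), lcs_length(A,B+1));
    }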

This is a correct solution but it's very time consuming. For example, if the two strings have no
matching characters, so the last line always gets executed, then the time bounds are binomial
coefficients, which (if m=n) grow roughly like 4^n and are always at least 2^n.
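The blow-up can be observed directly by counting calls rather than computing lengths. The following is a minimal sketch; lcs_call_count is a hypothetical helper, not part of the lecture's code, that follows the same three cases as the recursive algorithm.

```c
/* Number of calls the plain recursion makes on A and B, including the
   initial call. It follows the same case analysis as lcs_length. */
unsigned long lcs_call_count(const char *A, const char *B)
{
    if (*A == '\0' || *B == '\0') return 1;
    if (*A == *B) return 1 + lcs_call_count(A + 1, B + 1);
    return 1 + lcs_call_count(A + 1, B) + lcs_call_count(A, B + 1);
}
```

For strings with no matching characters the count works out to 2*C(m+n, m) - 1: "ab" against "cd" takes 11 calls and "abc" against "def" takes 39. Two identical three-letter strings take only 4.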

Memoization
The problem with the recursive solution is that the same subproblems get called many different times.
A subproblem consists of a call to lcs_length, with the arguments being two suffixes of A and B, so
there are exactly (m+1)(n+1) possible subproblems (a relatively small number). If there are 2^n or
more recursive calls, some of these subproblems must be being solved over and over.
The dynamic programming solution is to check whenever we want to solve a subproblem, whether
we've already done it before. If so we look up the solution instead of recomputing it. Implemented in
the most direct way, we just add some code to our recursive algorithm to do this lookup; this
"top-down", recursive version of dynamic programming is known as "memoization".
In the LCS problem, subproblems consist of a pair of suffixes of the two input strings. To make it
easier to store and look up subproblem solutions, I'll represent these by the starting positions in the
strings, rather than (as I wrote it above) character pointers.
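Recursive LCS with indices:
    const char *A;
    const char *B;

    int subproblem(int i, int j);    /* declared before its first use */

    int lcs_length(const char *AA, const char *BB)
    {
        A = AA; B = BB;
        return subproblem(0, 0);
    }

    /* LCS length of the suffixes of A and B starting at positions i and j.
       max(a,b) returns the larger of its two arguments. */
    int subproblem(int i, int j)
    {
        if (A[i] == '\0' || B[j] == '\0') return 0;
        else if (A[i] == B[j]) return 1 + subproblem(i+1, j+1);
        else return max(subproblem(i+1, j), subproblem(i, j+1));
    }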

Now to turn this into a dynamic programming algorithm we need only use an array to store the
subproblem results. When we want the solution to a subproblem, we first look in the array, and check
if there already is a solution there. If so we return it; otherwise we perform the computation and store
the result. In the LCS problem, no result is negative, so we'll use -1 as a flag to tell the algorithm that
nothing has been stored yet.
Memoizing LCS:
    char * A;
    char * B;
    array L;    /* (m+1) by (n+1) entries; L[i,j] < 0 means "not yet computed" */

    int lcs_length(char * AA, char * BB)
    {
        A = AA; B = BB;
        allocate storage for L;
        for (i = 0; i <= m; i++)
            for (j = 0; j <= n; j++)
                L[i,j] = -1;

        return subproblem(0, 0);
    }

    int subproblem(int i, int j)
    {
        if (L[i,j] < 0) {
            if (A[i] == '\0' || B[j] == '\0') L[i,j] = 0;
            else if (A[i] == B[j]) L[i,j] = 1 + subproblem(i+1, j+1);
            else L[i,j] = max(subproblem(i+1, j), subproblem(i, j+1));
        }
        return L[i,j];
    }

Time analysis: each call to subproblem takes constant time (not counting the recursive calls it
makes). We call it once from the main routine, and at most twice every time we fill in an entry of
array L. There are (m+1)(n+1) entries, so the total number of calls is at most 2(m+1)(n+1)+1 and the
time is O(mn).
As usual, this is a worst case analysis. The time might sometimes be better, if not all array entries get
filled out. For instance if the two strings match exactly, we'll only fill in diagonal entries and the
algorithm will be fast.
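The bound on the number of calls can be checked experimentally. Below is a minimal runnable sketch of the memoizing algorithm in C, under a few assumptions of my own: the table is stored as a flattened array, and the function name lcs_length_memo and its call_count output parameter are hypothetical additions used only to count calls.

```c
#include <stdlib.h>
#include <string.h>

static const char *A, *B;
static size_t n_cols;    /* n + 1, the width of one table row */
static int *L;           /* flattened (m+1) x (n+1) table; -1 = not computed */
static long calls;       /* number of calls made to subproblem */

static int subproblem(size_t i, size_t j)
{
    int *cell = &L[i * n_cols + j];
    calls++;
    if (*cell < 0) {
        if (A[i] == '\0' || B[j] == '\0') *cell = 0;
        else if (A[i] == B[j]) *cell = 1 + subproblem(i + 1, j + 1);
        else {
            int down = subproblem(i + 1, j);
            int right = subproblem(i, j + 1);
            *cell = down > right ? down : right;
        }
    }
    return *cell;
}

/* Returns the LCS length, or -1 if the table cannot be allocated.
   If call_count is non-NULL it receives the number of subproblem calls. */
int lcs_length_memo(const char *AA, const char *BB, long *call_count)
{
    size_t m = strlen(AA), n = strlen(BB), k;
    int result;
    A = AA; B = BB; n_cols = n + 1; calls = 0;
    L = malloc((m + 1) * (n + 1) * sizeof *L);
    if (L == NULL) return -1;
    for (k = 0; k < (m + 1) * (n + 1); k++) L[k] = -1;
    result = subproblem(0, 0);
    free(L);
    L = NULL;
    if (call_count != NULL) *call_count = calls;
    return result;
}
```

On two identical three-letter strings this makes 4 calls, one per diagonal entry, which illustrates the remark that matching strings only touch the diagonal.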

Bottom-up dynamic programming
We can view the code above as just being a slightly smarter way of doing the original recursive
algorithm, saving work by not repeating subproblem computations. But it can also be thought of as a
way of computing the entries in the array L. The recursive algorithm controls what order we fill them
in, but we'd get the same results if we filled them in in some other order. We might as well use
something simpler, like a nested loop, that visits the array systematically. The only thing we have to
worry about is that when we fill in a cell L[i,j], we need to already know the values it depends on,
namely in this case L[i+1,j], L[i,j+1], and L[i+1,j+1]. For this reason we'll traverse the array
backwards, from the last row working up to the first and from the last column working up to the first.
This is iterative (because it uses nested loops instead of recursion) or bottom-up (because the order we
fill in the array is from smaller simpler subproblems to bigger more complicated ones).
Iterative LCS:
    int lcs_length(char * A, char * B)
    {
        allocate storage for array L;    /* (m+1) by (n+1) entries */
        for (i = m; i >= 0; i--)
            for (j = n; j >= 0; j--)
            {
                if (A[i] == '\0' || B[j] == '\0') L[i,j] = 0;
                else if (A[i] == B[j]) L[i,j] = 1 + L[i+1, j+1];
                else L[i,j] = max(L[i+1, j], L[i, j+1]);
            }
        return L[0,0];
    }

Advantages of this method include the fact that iteration is usually faster than recursion, we don't need
to initialize the matrix to all -1's, and we save three if statements per iteration since we don't need to
test whether L[i,j], L[i+1,j], and L[i,j+1] have already been computed (we know in advance that the
answers will be no, yes, and yes). One disadvantage over memoizing is that this fills in the entire
array even when it might be possible to solve the problem by looking at only a fraction of the array's
cells.
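For readers who want something they can compile, here is one way the iterative algorithm might look in C. It is a sketch under my own assumptions: the table is a flattened array obtained from calloc (so the last row and column start out as zero and the loops can skip them), and the name lcs_length_iter is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Bottom-up LCS length. Returns -1 if the table cannot be allocated. */
int lcs_length_iter(const char *A, const char *B)
{
    size_t m = strlen(A), n = strlen(B), cols = n + 1, i, j;
    int result;
    /* calloc zero-fills, which already sets the last row and column */
    int *L = calloc((m + 1) * cols, sizeof *L);
    if (L == NULL) return -1;
    for (i = m; i-- > 0; )            /* rows m-1 down to 0 */
        for (j = n; j-- > 0; ) {      /* columns n-1 down to 0 */
            if (A[i] == B[j])
                L[i * cols + j] = 1 + L[(i + 1) * cols + j + 1];
            else {
                int down = L[(i + 1) * cols + j];
                int right = L[i * cols + j + 1];
                L[i * cols + j] = down > right ? down : right;
            }
        }
    result = L[0];
    free(L);
    return result;
}
```

The countdown loops are written as `i-- > 0` because the indices are unsigned; a test of the form `i >= 0` would never become false for a size_t.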

The subsequence itself
What if you want the subsequence itself, and not just its length? This is important for some but not all
of the applications we mentioned. Once we have filled in the array L described above, we can find the
sequence by working forwards through the array.
    sequence S = empty;
    i = 0;
    j = 0;
    while (i < m && j < n)
    {
        if (A[i]==B[j])
        {
            add A[i] to end of S;
            i++; j++;
        }
        else if (L[i+1,j] >= L[i,j+1]) i++;
        else j++;
    }

Let's see an example of this. Here's the array for the earlier example (the last row and the last
column correspond to the ends of the two strings):
      n e m a t o d e _ k n o w l e d g e
    e 7 7 6 5 5 5 5 5 4 3 3 3 2 2 2 1 1 1 0
    m 6 6 6 5 5 4 4 4 4 3 3 3 2 2 1 1 1 1 0
    p 5 5 5 5 5 4 4 4 4 3 3 3 2 2 1 1 1 1 0
    t 5 5 5 5 5 4 4 4 4 3 3 3 2 2 1 1 1 1 0
    y 4 4 4 4 4 4 4 4 4 3 3 3 2 2 1 1 1 1 0
    _ 4 4 4 4 4 4 4 4 4 3 3 3 2 2 1 1 1 1 0
    b 3 3 3 3 3 3 3 3 3 3 3 3 2 2 1 1 1 1 0
    o 3 3 3 3 3 3 3 3 3 3 3 3 2 2 1 1 1 1 0
    t 3 3 3 3 3 2 2 2 2 2 2 2 2 2 1 1 1 1 0
    t 3 3 3 3 3 2 2 2 2 2 2 2 2 2 1 1 1 1 0
    l 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 1 1 1 0
    e 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0
      0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

(You can check that each entry is computed correctly from the entries below and to the right. In this
array the row index i runs over "empty bottle" and the column index j over "nematode knowledge".)
To find the longest common subsequence, look at the first entry L[0,0]. This is 7, telling us that the
sequence has seven characters. L[0,0] was computed as max(L[0,1],L[1,0]), corresponding to the
subproblems formed by deleting either the "n" from the first string or the "e" from the second.
Deleting the "n" gives a subsequence of length L[0,1]=7, but deleting the "e" only gives L[1,0]=6, so
we can only delete the "n". Now let's look at the entry L[0,1] coming from this deletion.
A[0]=B[1]="e" so we can safely include this "e" as part of the subsequence, and move to L[1,2]=6.
Similarly this entry gives us an "m" in our sequence. Continuing in this way (and breaking ties as in
the algorithm above, by moving down instead of across) gives the common subsequence "emt ole".
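The walk through the array can be reproduced with a short program. The following is a sketch in C that fills the table bottom-up and then traces forwards with the same tie-breaking rule; the function name lcs_string and the flattened table layout are my own choices, not the lecture's.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated LCS of A and B (caller frees), or NULL if
   allocation fails. Ties are broken by moving down (i++), as in the text. */
char *lcs_string(const char *A, const char *B)
{
    size_t m = strlen(A), n = strlen(B), cols = n + 1, i, j, len = 0;
    int *L = calloc((m + 1) * cols, sizeof *L);
    char *S = malloc((m < n ? m : n) + 1);
    if (L == NULL || S == NULL) { free(L); free(S); return NULL; }

    /* Fill the table bottom-up; calloc already zeroed the last row/column. */
    for (i = m; i-- > 0; )
        for (j = n; j-- > 0; ) {
            int down = L[(i + 1) * cols + j], right = L[i * cols + j + 1];
            if (A[i] == B[j]) L[i * cols + j] = 1 + L[(i + 1) * cols + j + 1];
            else L[i * cols + j] = down > right ? down : right;
        }

    /* Walk forwards from L[0,0], collecting the matched characters. */
    i = 0; j = 0;
    while (i < m && j < n) {
        if (A[i] == B[j]) { S[len++] = A[i]; i++; j++; }
        else if (L[(i + 1) * cols + j] >= L[i * cols + j + 1]) i++;
        else j++;
    }
    S[len] = '\0';
    free(L);
    return S;
}
```

Called with A = "empty bottle" and B = "nematode knowledge", matching the orientation of the array above, it returns "emt ole".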
So we can find longest common subsequences in time O(mn). Actually, if you look at the matrix
above, you can tell that it has a lot of structure: the numbers in the matrix form large blocks in which
the value is constant, with only a small number of "corners" at which the value changes. It turns out
that one can take advantage of these corners to speed up the computation. The current (theoretically)
fastest algorithm for longest common subsequences (due to myself and co-authors) runs in time
O(n log s + c log log min(c,mn/c)) where c is the number of these corners, and s is the number of
characters appearing in the two strings.

Relation to paths in graphs
Let's draw a directed graph, with vertices corresponding to entries in the array L, and an edge
connecting an entry to one it depends on: either one edge to L[i+1,j+1] if A[i]=B[j], or two edges to
L[i+1,j] and L[i,j+1] otherwise. Don't draw edges from the bottom right fringe of the array (since
those entries don't depend on any others).
      n e m a t o d e _ k n o w l e d g e
    e o-o o-o-o-o-o-o o-o-o-o-o-o-o o-o-o o
      |  \| | | | |  \| | | | | |  \| |  \|
    m o-o-o o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
      | |  \| | | | | | | | | | | | | | | |
    p o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
      | | | | | | | | | | | | | | | | | | |
    t o-o-o-o-o o-o-o-o-o-o-o-o-o-o-o-o-o-o
      | | | |  \| | | | | | | | | | | | | |
    y o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
      | | | | | | | | | | | | | | | | | | |
    _ o-o-o-o-o-o-o-o-o o-o-o-o-o-o-o-o-o-o
      | | | | | | | |  \| | | | | | | | | |
    b o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o-o
      | | | | | | | | | | | | | | | | | | |
    o o-o-o-o-o-o o-o-o-o-o-o o-o-o-o-o-o-o
      | | | | |  \| | | | |  \| | | | | | |
    t o-o-o-o-o o-o-o-o-o-o-o-o-o-o-o-o-o-o
      | | | |  \| | | | | | | | | | | | | |
    t o-o-o-o-o o-o-o-o-o-o-o-o-o-o-o-o-o-o
      | | | |  \| | | | | | | | | | | | | |
    l o-o-o-o-o-o-o-o-o-o-o-o-o-o o-o-o-o-o
      | | | | | | | | | | | | |  \| | | | |
    e o-o o-o-o-o-o-o o-o-o-o-o-o-o o-o-o o
      |  \| | | | |  \| | | | | |  \| |  \|
      o o o o o o o o o o o o o o o o o o o

Then if you look at any path in the graph, the diagonal edges form a subsequence of the two strings.
Conversely, if you define the horizontal and vertical edges to have length zero, and the diagonal edges
to have length one, the longest common subsequence corresponds to the longest path from the top left
corner to one of the bottom right vertices. This graph is acyclic, so we can compute longest paths in
time linear in the size of the graph, here O(mn).
Where did these edge lengths come from? They're just the amount by which the LCS length increases
compared to the length at the corresponding subproblem. If A[i]=B[j], then L[i,j]=L[i+1,j+1]+1, and
we use that last "+1" as the edge length. Otherwise, L[i,j]=max(L[i+1,j],L[i,j+1])+0, so we use zero as
the edge length.
This sort of phenomenon, in which a dynamic programming problem turns out to be equivalent to a
shortest or longest path problem, does not always happen with other problems, but it is reasonably
common. This idea doesn't really help compute the single longest common subsequence, but one of
my papers uses similar graph-theoretic ideas to find multiple long common subsequences (and
multiple solutions to many other problems).

Reduced space complexity
One disadvantage of the dynamic programming methods we've described, compared to the original
recursion, is that they use a lot of space: O(mn) for the array L (the recursion only uses O(n+m)). But
the iterative version can be easily modified to use less space; the observation is that once we've
computed row i of array L, we no longer need the values in row i+1.
Space-efficient LCS:
    int lcs_length(char * A, char * B)
    {
        allocate storage for one-dimensional arrays X and Y;    /* n+1 entries each */
        for (i = m; i >= 0; i--)
        {
            for (j = n; j >= 0; j--)
            {
                if (A[i] == '\0' || B[j] == '\0') X[j] = 0;
                else if (A[i] == B[j]) X[j] = 1 + Y[j+1];
                else X[j] = max(Y[j], X[j+1]);
            }
            copy X into Y;    /* entry by entry; assigning pointers would alias the two rows */
        }
        return X[0];
    }

This takes roughly the same amount of time as before, O(mn); it uses a little more time to copy X
into Y but this only increases the time by a constant factor (and can be avoided with some more care).
The space is either O(m) or O(n), whichever is smaller (switch the two strings if necessary so there are
more rows than columns). Unfortunately, this solution does not leave you with enough information to
find the subsequence itself, just its length.
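One way the copy might be avoided is to swap the roles of the two rows after each pass instead of copying. The sketch below takes that route in C; the name lcs_length_two_rows is hypothetical, and the rows are heap-allocated with n+1 entries each.

```c
#include <stdlib.h>
#include <string.h>

/* LCS length using two rows of n+1 ints. Returns -1 on allocation failure. */
int lcs_length_two_rows(const char *A, const char *B)
{
    size_t m = strlen(A), n = strlen(B), i, j;
    int *X = calloc(n + 1, sizeof *X);   /* row i, currently being filled */
    int *Y = calloc(n + 1, sizeof *Y);   /* row i+1, already complete */
    int *tmp, result;
    if (X == NULL || Y == NULL) { free(X); free(Y); return -1; }
    for (i = m; i-- > 0; ) {
        X[n] = 0;                        /* column n: the suffix of B is empty */
        for (j = n; j-- > 0; ) {
            if (A[i] == B[j]) X[j] = 1 + Y[j + 1];
            else X[j] = Y[j] > X[j + 1] ? Y[j] : X[j + 1];
        }
        tmp = X; X = Y; Y = tmp;         /* the finished row becomes "row i+1" */
    }
    result = Y[0];                       /* after the last swap Y holds row 0 */
    free(X); free(Y);
    return result;
}
```

Since the rows are indexed by positions in B, passing the shorter string as B keeps the space at O(min(m,n)).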
In 1975, Dan Hirschberg showed how to find not just the length, but the longest common subsequence
itself, in linear space and O(mn) time. The idea is as above to use one-dimensional arrays X and Y to
store the rows of the larger two-dimensional array L. But Hirschberg's method treats the middle row of
array L specially: for all i<m/2, he stores along with the numbers in X and Y the place where some
path (corresponding to a subsequence with that many characters) crosses the middle row. These
crossing places can be updated along with the array values, by copying them from X[j+1], Y[j], or
Y[j+1] as appropriate.
Then when the algorithm above has finished with the LCS length in X[0], Hirschberg finds the
corresponding crossing place (m/2,k). He then solves recursively two LCS problems, one for
A[0..m/2-1] and B[0..k-1] and one for A[m/2..m] and B[k..n]. The longest common subsequence is
the concatenation of the sequences found by these two recursive calls.
It is not hard to see that this method uses linear space. What about time complexity? This is a
recursive algorithm, with a time recurrence
    T(m,n) = O(mn) + T(m/2,k) + T(m/2,n-k)

You can think of this as sort of like quicksort: we're breaking both strings into parts. But unlike
quicksort it doesn't matter that the second string can be broken unequally. No matter what k is, the
recurrence solves to O(mn). The easiest way to see this is to think about what it's doing in the array L.
The main part of the algorithm visits the whole array, then the two calls visit two subarrays, one above
and left of (m/2,k) and the other below and to the right. No matter what k is, the total size of these two
subarrays is roughly mn/2. So instead we can write a simplified recurrence
    T(mn) = O(mn) + T(mn/2)

which solves to O(mn) time total, since the work forms a geometric series mn + mn/2 + mn/4 + ...
that sums to less than 2mn.
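A linear-space recovery of the subsequence can be sketched in C as follows. Note that this is not the crossing-place bookkeeping described above. It is the commonly presented variant of the same divide-and-conquer idea, which finds the split point k by combining a forward pass over the top half of A with a backward pass over the bottom half. All function names here are my own.

```c
#include <stdlib.h>
#include <string.h>

/* row[j] = LCS length of a[0..m) and b[0..j), for j = 0..n. */
static void lcs_row_fwd(const char *a, int m, const char *b, int n, int *row)
{
    int i, j, diag, tmp;
    for (j = 0; j <= n; j++) row[j] = 0;
    for (i = 0; i < m; i++) {
        diag = 0;                          /* previous row, column j-1 */
        for (j = 1; j <= n; j++) {
            tmp = row[j];                  /* previous row, column j */
            if (a[i] == b[j - 1]) row[j] = diag + 1;
            else if (row[j - 1] > row[j]) row[j] = row[j - 1];
            diag = tmp;
        }
    }
}

/* row[j] = LCS length of a[0..m) and b[j..n), for j = 0..n. */
static void lcs_row_bwd(const char *a, int m, const char *b, int n, int *row)
{
    int i, j, diag, tmp;
    for (j = 0; j <= n; j++) row[j] = 0;
    for (i = m - 1; i >= 0; i--) {
        diag = 0;                          /* previous row, column j+1 */
        for (j = n - 1; j >= 0; j--) {
            tmp = row[j];
            if (a[i] == b[j]) row[j] = diag + 1;
            else if (row[j + 1] > row[j]) row[j] = row[j + 1];
            diag = tmp;
        }
    }
}

/* Appends an LCS of a[0..m) and b[0..n) to out, advancing *len.
   f and g are scratch rows with at least n+1 entries each. */
static void solve(const char *a, int m, const char *b, int n,
                  int *f, int *g, char *out, int *len)
{
    int j, k, mid;
    if (m == 0 || n == 0) return;
    if (m == 1) {                          /* one character: is it in b? */
        if (memchr(b, a[0], (size_t)n) != NULL) out[(*len)++] = a[0];
        return;
    }
    mid = m / 2;
    lcs_row_fwd(a, mid, b, n, f);          /* top half against prefixes of b */
    lcs_row_bwd(a + mid, m - mid, b, n, g);/* bottom half against suffixes */
    k = 0;
    for (j = 1; j <= n; j++)               /* best place to split b */
        if (f[j] + g[j] > f[k] + g[k]) k = j;
    solve(a, mid, b, k, f, g, out, len);
    solve(a + mid, m - mid, b + k, n - k, f, g, out, len);
}

/* Returns a newly allocated LCS of a and b (caller frees), or NULL. */
char *lcs_linear_space(const char *a, const char *b)
{
    int m = (int)strlen(a), n = (int)strlen(b), len = 0;
    int *f = malloc((size_t)(n + 1) * sizeof *f);
    int *g = malloc((size_t)(n + 1) * sizeof *g);
    char *out = malloc((size_t)(m < n ? m : n) + 1);
    if (f != NULL && g != NULL && out != NULL) {
        solve(a, m, b, n, f, g, out, &len);
        out[len] = '\0';
    } else {
        free(out);
        out = NULL;
    }
    free(f); free(g);
    return out;
}
```

The two scratch rows are shared by all recursive calls, which is safe because k is chosen before either call is made. When several longest common subsequences exist, this variant may return a different one than the forward walk through the full array.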
ICS 161, Dept. Information & Computer Science, UC Irvine
Lastupdate:
