TOC Lec1

Uploaded by

Mohammed Ahmed
Text Books

* Peter Linz. An Introduction to Formal Languages and Automata, 6th Edition, Jones & Bartlett Learning, India.
* Michael Sipser. Introduction to the Theory of Computation, 2nd Edition, Thomson Course Technology.
* John E. Hopcroft, Rajeev Motwani, and Jeffrey D. Ullman. Introduction to Automata Theory, Languages, and Computation, 2nd Edition, Addison-Wesley.
* John C. Martin. Introduction to Languages and the Theory of Computation, 3rd Edition, McGraw-Hill.
* Harry R. Lewis and Christos H. Papadimitriou. Elements of the Theory of Computation, 2nd Ed., PHI.
* K. L. P. Mishra. Theory of Computer Science, 3rd Ed., Prentice Hall India.

Theory of Computation: Mathematical Preliminaries

* Sets, Relations, Functions
* Graphs
* Proof Techniques
* Languages
* Grammars

Languages: Basics

* Alphabet
  - An alphabet (Σ) is a finite, non-empty set of symbols.
  - Examples:
    - {0, 1}: the binary alphabet.
    - {0, 1, ..., 9}: the alphabet for natural numbers.
    - {a, b, ..., z, A, B, ..., Z}: the English alphabet.
    - {a, ..., z, A, ..., Z, 0, ..., 9, _}: the alphabet for C variables.
    - The set of all ASCII characters.
    - Etc.
* String
  - A string is a finite sequence of symbols from the alphabet Σ.
  - Examples:
    - 0, 1, 11, 00, and 01101: strings over {0, 1}.
    - Cat, CAT, and, AND, compute: strings over the English alphabet.
    - a, b, ab, aa, ba, bb, aaa, aab, abaa: strings over {a, b}.
    - abc, num, const1, const2: strings over the alphabet for C variables.
  - Note: infinitely many strings exist over any alphabet (each individual string is finite).

Language: Basics

* Empty String
  - The string with zero occurrences of symbols from the alphabet Σ.
  - It is denoted by e, ε, or λ. These slides use λ.
  - It is a string without symbols, over any alphabet.
* String Length
  - The length of a string w is the number of occurrences of symbols in w.
  - It is denoted by |w| or length(w).
  - Example: let Σ = {a, b, ..., z, A, B, ..., Z}.
    - length(automata) = 8, length(computation) = 11
    - length(λ) = 0
  - Example: let Σ = {0, 1}.
    - |λ| = 0, |1010| = 4, |0000000| = 7, |1010101010| = 10
  - w[i] denotes the symbol in the i-th position of a string w, for 1 ≤ i ≤ length(w).
* Caution: do not confuse the empty set with the empty string.
  - Sets: ∅ = {} ≠ {λ}
  - Set size: |{}| = |∅| = 0
  - Set size: |{λ}| = 1
  - String length: |λ| = 0

String Operations

* Substring
  - z is a substring of w if it is a sequence of consecutive symbols of w.
  - Substrings of "Theory" include 'T', 'Th', 'eo', and 'Theory'.
* Prefix (leading substring)
  - A prefix of a string w is any leading contiguous part of w.
  - z is a prefix of a string w if there is a string x such that w = zx.
  - Prefixes of "Theory" include 'T', 'Th', and 'Theo'.
  - Example: for w = abbab, A = {λ, a, ab, abb, abba, abbab} is the set of all prefixes.
* Suffix (trailing substring)
  - A suffix of a string w is any trailing contiguous part of w.
  - z is a suffix of w if there is a string x such that w = xz.
  - 'theory' is the suffix that follows 'love' in "lovetheory".
  - Example: for w = abbab, B = {λ, b, ab, bab, bbab, abbab} is the set of all suffixes.

String Operations

* Concatenation
  - If x and y are two strings, then xy is obtained by appending the symbols of y at the right end of x.
  - Examples:
    - x = a1 a2 ... an, y = b1 b2 ... bm, xy = a1 a2 ... an b1 b2 ... bm
    - x = 01101, y = 110, xy = 01101110
  - Note: if u and v are strings, then the length of their concatenation is the sum of the individual lengths: |uv| = |u| + |v|.
* Power of a string w
  - Given any natural number n, w^n represents the string obtained by concatenating w with itself n times.
  - The empty string is the identity for concatenation: λx = xλ = x.

String Operations

* Reverse
  - The reverse of a string is obtained by writing the symbols in reverse order.
  - The reversal w^R of a string w is defined recursively:
    - w^R = λ, if |w| = 0
    - w^R = u^R a, if w = au, where a ∈ Σ and u ∈ Σ*
  - Property: (xy)^R = y^R x^R
  - Example: (abbab)^R = babba

Closure of Alphabet

Kleene Closure: Σ*

* The set of strings obtained by concatenating zero or more symbols from the alphabet Σ is denoted by Σ*.
* Σ* consists of strings of every possible length.
* Σ* = ∪_{k ≥ 0} Σ^k = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ ..., where Σ^k is the set of strings of length k.
* Example: let Σ = {0, 1}.
  - Σ* = {λ, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, ...}

Positive Closure: Σ+

* The set of strings created by concatenating at least one symbol (1 or 2 or ...) from the alphabet Σ is denoted by Σ+.
* Σ+ does not contain the empty string λ.
* Σ+ is an infinite set since there is no limit on the length of the strings.
* Σ+ = ∪_{k ≥ 1} Σ^k = (∪_{k ≥ 0} Σ^k) - Σ^0 = Σ* - {λ}
* Example: let Σ = {0, 1}.
  - Σ+ = {0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, 110, 111, ...}

Closure of Alphabet

* The following relations hold:
  - Σ* = Σ+ ∪ {λ}
  - Σ+ = Σ* - {λ}
* Find Σ* and Σ+ for:
  - Σ = {a, b}
  - Σ = {a, b, c, ..., z, A, B, C, ..., X, Y, Z}
  - Σ = {0, 1, 2, ..., 9}
  - Σ = {0, 1}
* Find the Kleene closure and the positive closure of the given set:
  - A = {00, 11, 10}

Language Definition

* A "language" is a set of strings over an alphabet of symbols.
* A language is a subset of Σ*.
* In everyday usage, a language is a system of communication consisting of symbols used by the people of a particular country or region for talking, writing, or establishing communication.
* Familiar languages:
  - Natural languages like English, Hindi, etc.
    - English may be considered a set of English words or sentences for which many grammar rules have been developed.
  - Programming languages like C, C++, Java, Python, etc.
    - A programming language may be considered a set of valid identifiers or statements for which many rules (enforced by compilers) have been developed.
* Observation: Language = Alphabet + Some Rules (Grammar)

Language: Formal and Informal Languages

The tone, the choice of words, and the way the words are put together vary between the two styles.

* Formal language is less personal than informal language. It is used when writing for professional or academic purposes like university assignments.
* Informal language is more casual and spontaneous. It is used when communicating with friends or family, either in writing or in conversation.
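The string operations and alphabet closures defined in the earlier slides (prefixes, suffixes, reversal, Σ^k, Σ*, and Σ+) can be tried out directly. The following is a minimal Python sketch, not part of the original slides: the function names are illustrative, Python's empty string `""` plays the role of λ, and because Σ* is infinite the sketch only lists strings up to an assumed bound `max_len`.

```python
from itertools import product


def prefixes(w):
    """All prefixes of w, from the empty string up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]


def suffixes(w):
    """All suffixes of w, from the empty string up to w itself."""
    return [w[i:] for i in range(len(w), -1, -1)]


def reverse(w):
    """w^R: the symbols of w written in reverse order."""
    return w[::-1]


def sigma_k(alphabet, k):
    """Sigma^k: every string of length exactly k over the alphabet."""
    return ["".join(symbols) for symbols in product(alphabet, repeat=k)]


def sigma_star(alphabet, max_len):
    """Finite slice of Sigma*: the union of Sigma^0 .. Sigma^max_len."""
    strings = []
    for k in range(max_len + 1):
        strings.extend(sigma_k(alphabet, k))
    return strings


def sigma_plus(alphabet, max_len):
    """Finite slice of Sigma+: Sigma* without the empty string."""
    return [w for w in sigma_star(alphabet, max_len) if w != ""]
```

For instance, `prefixes("abbab")` reproduces the set A from the prefix slide, and `sigma_star("01", 2)` lists the first seven elements of {0, 1}*. Concatenation and power need no helper in Python: `x + y` is xy and `w * n` is w^n.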
Formal Language

* In logic, mathematics, computer science, and linguistics, a formal language consists of words whose letters are taken from an alphabet and are well-formed according to a specific set of rules.
* The alphabet of a formal language consists of symbols, letters, or tokens that concatenate into strings of the language.
* Examples: Regular Language, Context-Free Language, Context-Sensitive Language, etc.

Languages

* English alphabet Σ = {a, b, ..., x, y, z, A, B, C, ..., X, Y, Z}
  - Σ* = {λ, a, b, ..., z, an, ball, cat, set, es, ..., A, ..., Z, APPLE, ...} is the set of all strings over the English alphabet.
  - L = {x ∈ Σ* | x is a valid English word}
* The set of valid Arabic words over the Arabic alphabet.
* {n | n is a prime number > 20} = {23, 29, 31, 37, ...} ⊆ {x : x is a prime number}
* The set of strings with an even number of 1's over {0, 1}:
  - Σ* = {0, 1}* = {λ, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, ...}
  - L = {w ∈ Σ* | the number of 1's in w is even} = {λ, 0, 00, 11, 000, 110, 101, 011, 0000, 1100, 1010, 1001, 0110, 0101, 0011, ...}
  - ∴ L ⊆ Σ*
* The set of strings consisting of n 0's followed by n 1's:
  - L = {λ, 01, 0011, 000111, ...} over Σ = {0, 1}
* The set of strings with an equal number of 0's and 1's:
  - L = {λ, 01, 10, 0011, 0101, 1010, 1001, 1100, ...} over Σ = {0, 1}
* Find the strings of the language L = {a^n b : n ≥ 0} over Σ = {a, b}.

Language Examples

1. The empty language ∅.
2. {λ, a, aab}, another finite language.
3. The language Pal of palindromes over {a, b} (strings such as aba or baab that are unchanged when the order of the symbols is reversed).
4. {x ∈ {a, b}* | n_a(x) > n_b(x)}.
5. {x ∈ {a, b}* | |x| ≥ 2 and x begins and ends with b}.
6. The language of legal Java identifiers.
7. The language Expr of legal algebraic expressions involving the identifier a, the binary operations + and *, and parentheses. Some of the strings in the language are a, a + a * a, and (a + a * (a + a)).
8. The language Balanced of balanced strings of parentheses (strings containing the occurrences of parentheses in some legal algebraic expression). Some elements are λ, ()(()), and ((((())))).
9. The language of numeric "literals" in Java, such as -41, 0.03, and 3.0E-3.
10. The language of legal Java programs. Here the alphabet would include upper- and lowercase alphabetic symbols, numerical digits, blank spaces, and punctuation and other special symbols.

Operations on Languages

* Let Σ be any finite alphabet {a1, a2, ..., an}.
* Any subset L of Σ* is called a language over Σ.
* A language is defined as a set.
  - The set operations are applicable to languages: union, intersection, difference, complement.
  - Additional operations: reversal, concatenation, Kleene closure (Kleene star), positive closure, ...
* Example:
  - L1 = {a, b, ab, ba} over Σ = {a, b}
  - L2 = {a, b, aa, bb} over Σ = {a, b}

Operations on Languages

* Union
  - Let L1 and L2 be languages over an alphabet Σ.
  - The union of L1 and L2 is L1 ∪ L2 = {x | x is in L1 or L2}.
  - Example:
    - L1 = {x ∈ {0,1}* | x begins with 0}, L2 = {x ∈ {0,1}* | x ends with 0}
    - L1 ∪ L2 = {x ∈ {0,1}* | x begins or ends with 0} = {0, 01, 10, 010, ..., 011111, 011110, 111110, ...}
* Intersection
  - Let L1 and L2 be languages over an alphabet Σ.
  - The intersection of L1 and L2 is L1 ∩ L2 = {x | x is in L1 and L2}.
  - Example:
    - L1 = {x ∈ {0,1}* | x begins with 0}, L2 = {x ∈ {0,1}* | x ends with 0}
    - L1 ∩ L2 = {x ∈ {0,1}* | x begins and ends with 0} = {0, 00, 010, 000, 0000, 0010, 0100, 0110, ...}

Operations on Languages

* Difference
  - Let L1 and L2 be languages over an alphabet Σ.
  - The difference of L2 from L1 is L1 - L2 = {x | x is in L1 and x is not in L2}.
  - Example:
    - L1 = {x ∈ {0,1}* | x begins with 0}, L2 = {x ∈ {0,1}* | x ends with 0}
    - L1 - L2 = {x ∈ {0,1}* | x begins with 0 and does not end with 0} = {01, 011, 001, 0001, 0011, 0101, 0111, ...}
* Complement
  - Let L be a language over an alphabet Σ.
  - The complement of L, denoted by L̄, is L̄ = Σ* - L.
  - Example: let Σ = {0, 1} be the alphabet.
    - L = {w ∈ Σ* | the number of 1's in w is even}
    - L̄ = {w ∈ Σ* | the number of 1's in w is not even} = {w ∈ Σ* | the number of 1's in w is odd}

Operations on Languages: Reversal

* Let L be a language over an alphabet Σ.
* The reversal of L is L^R = {w^R | w is in L}.
* Example 1:
  - L = {x ∈ {0,1}* | x begins with 0}
  - L^R = {x ∈ {0,1}* | x ends with 0}
* Example 2:
  - L = {x ∈ {0,1}* | x has 00 as a substring}
  - L^R = {x ∈ {0,1}* | x has 00 as a substring}

Operations on Languages: Concatenation

* Let L1 and L2 be languages over an alphabet Σ.
* The concatenation of L1 and L2 is L1·L2 = {w1·w2 | w1 is in L1 and w2 is in L2}.
* Example 1:
  - L1 = {a, b}, L2 = {1, 11, 01}
  - L1L2 = {a1, a11, a01, b1, b11, b01}
  - L2L1 = {1a, 1b, 11a, 11b, 01a, 01b}
* Example 2:
  - L1 = {x ∈ {0,1}* | x begins with 0}
  - L2 = {x ∈ {0,1}* | x ends with 0}
  - L1L2 = {x ∈ {0,1}* | x begins and ends with 0 and length(x) ≥ 2}
  - L2L1 = {x ∈ {0,1}* | x has 00 as a substring}

Operations on Languages: Kleene Closure (Star Closure) of a Language

* Let L be a language over an alphabet Σ.
* The Kleene closure of L is obtained by concatenating zero or more elements of L.
* L* = {x | for an integer n ≥ 0, x = x1 x2 ... xn and x1, x2, ..., xn are in L}
* L* = ∪_{i ≥ 0} L^i = L^0 ∪ L^1 ∪ L^2 ∪ ...
* Is L* different from Σ*?
* Example: Σ = {0, 1} and L = {w ∈ Σ* | the number of 1's in w is even}
  - Σ* = {λ, 0, 1, 00, 01, 10, 11, ...}
  - L* = {w ∈ Σ* | the number of 1's in w is even}
* Example: let Σ = {a, b} and L = {a^n b^n : n ≥ 0}.
  - L^0 = {λ}
  - L^1 = L = {λ, ab, aabb, aaabbb, ...}
  - L^2 = {a^n b^n a^m b^m : n, m ≥ 0} = {λ, ab, aabb, aaabbb, abaabb, abaaabbb, ...}

Operations on Languages: Positive Closure of a Language

* Let L be a language over an alphabet Σ.
* The positive closure of L is obtained by concatenating one or more elements of L.
* L+ = {x | for an integer n ≥ 1, x = x1 x2 ... xn and x1, x2, ..., xn are in L}
* That is, L+ = ∪_{i ≥ 1} L^i = L·L* = L^1 ∪ L^2 ∪ L^3 ∪ ...
  - L^0 = {λ}
  - L^k = the set of strings obtained by concatenating k elements of L
* Example: let Σ = {0, 1} be the alphabet.
  - L = {w ∈ Σ* | the number of 1's in w is even}
  - L+ = {w ∈ Σ* | the number of 1's in w is even}
  - L* = {w ∈ Σ* | the number of 1's in w is even}, since λ ∈ L
* Find the closures of the following languages:
  - L1 = {a, b}, L2 = {11}

Observations: Language Closure

* Is L+ = L* - {λ}? Is L* = L+ ∪ {λ}?
* Example 1:
  - L = {w ∈ Σ* | the number of 1's in w is even}
  - L+ = {w ∈ Σ* | the number of 1's in w is even} = L*
* Example 2:
  - L = {1, 0}
  - L+ = {1, 0, 00, 01, 10, 11, 000, ...}, L* = {λ, 1, 0, 00, 01, 10, 11, 000, ...}
* Observations:
  1. If the language L does not contain λ, then L* and L+ are not equal.
  2. If the language L contains λ, then L* = L+.

Grammars

Mathematical mechanisms to describe languages.

Grammars

* English grammar tells us whether a given combination of words is a valid sentence, syntactically and semantically.
* Example: "A mouse wrote a poem."
  - From a syntax point of view, it is a valid sentence.
  - From a semantics point of view, it is not valid (except perhaps in Disneyland).
* The syntax of a sentence is concerned with its structure.
* The semantics of a sentence is concerned with its meaning.
* Natural languages (English, Hindi, French, Urdu, German, etc.):
  - Very complex rules of syntax
  - Not necessarily well-defined
  - Ambiguous

Formal Language/Grammar

* Formal grammar
  - It is used to decide whether a string over the alphabet is correct or incorrect in a language.
  - It is used to generate all strings over the alphabet that are syntactically correct in the language.
  - It is used mostly in the syntactic analysis phase (parsing), particularly during compilation.
* A grammar implies an algorithm that would generate all valid sentences of the language.
  - Often, it takes the form of a set of recursive definitions.
* Formal language: the set of all strings generated by a grammar.
* Formal languages provide models for both natural languages and programming languages.
  - Amazon Alexa
  - Siri (Apple)
  - Compilers

Grammars

Two key questions:

1. Is a combination of words a valid sentence (string) in a formal language?
2. How can we generate valid sentences in a formal language?

Grammars express languages. Example: the English language.

Rules:

1. A sentence consists of a noun phrase followed by a predicate.
2. A noun phrase consists of an article and a noun.
3. Etc.

Grammar's Production Rules

<sentence> → <noun_phrase> <predicate>
<noun_phrase> → <article> <noun>
<predicate> → <verb>
<article> → a
<article> → the
<noun> → cat
<noun> → dog
<noun> → boy
<verb> → runs
<verb> → walks
<verb> → sleeps

Derivation

A derivation of "the dog walks":

<sentence> ⇒ <noun_phrase> <predicate>
           ⇒ <noun_phrase> <verb>
           ⇒ <article> <noun> <verb>
           ⇒ the <noun> <verb>
           ⇒ the dog <verb>
           ⇒ the dog walks

A derivation of "a cat runs":

<sentence> ⇒ <noun_phrase> <predicate>
           ⇒ <noun_phrase> <verb>
           ⇒ <article> <noun> <verb>
           ⇒ a <noun> <verb>
           ⇒ a cat <verb>
           ⇒ a cat runs

Derivation

* The following sentences can be derived:
  - "the boy sleeps"
  - "the boy runs"
  - "the boy walks"
  - "the boy eats" ????
  - "the boy eats pizza" ????
* The language of the grammar is the set of all strings (sentences) generated by the grammar.
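The set of sentences generated by the production rules on these slides can be enumerated mechanically. The following is a minimal Python sketch, not taken from the slides: the dictionary `RULES` and the function `expand` are illustrative names, and the approach (always replacing the leftmost variable) only terminates because none of these rules is recursive.

```python
# Production rules from the slides: each variable maps to its right-hand sides.
RULES = {
    "<sentence>": [["<noun_phrase>", "<predicate>"]],
    "<noun_phrase>": [["<article>", "<noun>"]],
    "<predicate>": [["<verb>"]],
    "<article>": [["a"], ["the"]],
    "<noun>": [["cat"], ["dog"], ["boy"]],
    "<verb>": [["runs"], ["walks"], ["sleeps"]],
}


def expand(form):
    """Yield every sentence derivable from a sentential form (a list of symbols)."""
    for i, symbol in enumerate(form):
        if symbol in RULES:  # leftmost variable: try each of its productions
            for rhs in RULES[symbol]:
                yield from expand(form[:i] + rhs + form[i + 1:])
            return
    # No variable left: the form consists of terminals only, so it is a sentence.
    yield " ".join(form)


LANGUAGE = list(expand(["<sentence>"]))
```

With two articles, three nouns, and three verbs, `LANGUAGE` holds 2 × 3 × 3 = 18 sentences. Checking `"the boy eats" in LANGUAGE` answers the question marks on the derivation slide: the word "eats" is not a terminal of this grammar, so that sentence cannot be derived.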
L = L(G) = {"a cat runs", "a cat walks", "the cat runs", "the cat walks", "a dog runs", "a dog walks", "the dog runs", "the dog walks", "the boy sleeps", ...}

Notation

* In the production rules <noun> → cat and <noun> → dog, <noun> is a variable, and cat and dog are terminals.
* Terminals = {a, the, dog, cat, boy, runs, walks, sleeps}
* Variables = {<sentence>, <noun_phrase>, <predicate>, <article>, <noun>, <verb>}

Grammar

A grammar G is defined as a quadruple G = (V, T, S, P), where

* V: finite and non-empty set of variables (non-terminals)
* T: finite and non-empty set of terminal symbols
* S: start variable, S ∈ V
* P: finite set of production rules

Notes:

* V and T are disjoint sets.
* Every production rule must contain at least one non-terminal on its left side.
* Form of a production rule: x → y, where x ∈ (V ∪ T)* V (V ∪ T)* and y ∈ (V ∪ T)*.
* Production rules:
  - Specify how to generate a string.
  - Specify how the grammar transforms one sentential form into another sentential form or string.
  - Define the language associated with the grammar.

Production Rules

* Production rule P_i: x → y
* Let w = uxv and z = uyv, where x, y, u, v ∈ (T ∪ V)*.
* Production rule P_i is applicable to the string w:
  - x can be replaced with y to form the new string z.
* This is written w ⇒ z and read "w derives z".

Grammar Example

* Grammar: G = (V, T, S, P) with V = {S}, T = {a, b}, P = {S → aSb, S → λ}
* What strings/sentences can be generated with this grammar? (What is the language generated by this grammar?)
* Sentential form: a finite sequence of variables and terminals.
* Sentence/string: a finite sequence of terminals.
* Derivation of a string: S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
  - S, aSb, aaSbb, and aaaSbbb are sentential forms.
  - aaabbb is a sentence/string.

More Notation

* We write S ⇒* aaabbb instead of S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb.
* The * indicates an unspecified number of steps (zero or more steps).
* If w1 ⇒ w2 ⇒ w3 ⇒ ... ⇒ wn, then we can write w1 ⇒* wn.
* By default: w ⇒* w.

Example Derivations

Grammar: S → aSb, S → λ

* S ⇒* λ
* S ⇒* ab
* S ⇒* aabb
* S ⇒* aaabbb
* S ⇒* aaSbb
* aaSbb ⇒* aaaaaSbbbbb

Another Grammar Example

Grammar G: S → Ab, A → aAb, A → λ

Derivations:

* S ⇒ Ab ⇒ b
* S ⇒ Ab ⇒ aAbb ⇒ abb
* S ⇒ Ab ⇒ aAbb ⇒ aaAbbb ⇒ aabbb
* S ⇒ Ab ⇒ aAbb ⇒ aaAbbb ⇒ aaaAbbbb ⇒ aaaaAbbbbb ⇒ aaaabbbbb
* A ⇒ aAb ⇒ aaAbb ⇒ aabb ???

Language of a Grammar

* Let G = (V, T, S, P) be a grammar. The language generated by G, denoted by L(G), is the set of all strings that can be derived from the start variable S.
L(G) = {w ∈ T* | S ⇒* w}

That is, L(G) is simply the set of strings of terminals that can be derived from the start symbol.

* Example: G = (V, T, S, P) with V = {A, S}, T = {a, b}, start symbol S, and P = {S → aA, S → b, A → aa}.
  - T* = {λ, a, b, aa, bb, ab, ba, aaa, aab, aba, abb, baa, bab, ...}
  - The language of this grammar is L(G) = {b, aaa}:
    1. We can derive aA from S using S → aA, and then derive aaa using A → aa.
    2. We can also derive b using S → b.

Language of a Given Grammar: How to Prove?

* L: a given language
* L(G): the language of a grammar G
* Show that every string generated by G is in L, i.e., L(G) ⊆ L.
* Show that every string in L can be generated by G, i.e., L ⊆ L(G).
* Example: G = (V, T, S, P) with V = {S}, T = {a, b}, P = {S → aSb | λ}, and L = {a^n b^n : n ≥ 0}.

Language of a Given Grammar

What is the language of the grammar G with productions S → aSb and S → λ?

* S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaabbb
* S ⇒ aSb ⇒ aaSbb ⇒ aaaSbbb ⇒ aaaaSbbbb ⇒ aaaabbbb
* L = {a^n b^n : n ≥ 0}

Find the languages of the following grammars:

* G1: S → aaA | λ, A → bS. Answer: L = {(aab)^n : n ≥ 0}
* G2: S → Aa, A → B, B → Aa. Answer: L = {} (no terminal string can be derived)
* G3: S → a | b | aSa | bSb | λ. Answer: L = {w ∈ {a, b}* : w is a palindrome}

Equivalent Grammars

Two grammars G1 and G2 are equivalent if they generate the same language, that is, if L(G1) = L(G2).

* First grammar G1: S → aSb, S → λ. L(G1) = {a^n b^n : n ≥ 0}
* Second grammar G2: S → aAb | λ, A → aAb | λ. L(G2) = {a^n b^n : n ≥ 0}
* The language generated by both grammars is L = {a^n b^n : n ≥ 0}, so G1 and G2 are equivalent.

Grammar for a Given Language

Procedure: write production rules that generate all the strings in the language, and only those strings.

1. L = {w ∈ {a}*} = {λ, a, aa, aaa, aaaa, ...}
   S → λ | aS
2. L = {w ∈ {a, b}*} = {λ, a, b, aa, ab, ba, bb, ...}
   S → λ | aS | bS
3. L = {w ∈ {a, b}* : w has exactly one a} = {a, ab, ba, abb, bab, bba, ...}
   S → XaX, X → bX | λ
4. L = {w ∈ {a, b}* : w has at least one a} = {a, aab, bab, aa, aaa, ..., aaaabbaaa, ...}
   S → XaX, X → aX | bX | λ
5. L = {w ∈ {a, b}* : w has no more than three a's} = {λ, a, b, aab, ab, bba, babab, bbbb, ...}
   S → λ | X | XaX | XaXaX | XaXaXaX, X → bX | λ
6. L = {w ∈ {a, b}* : n_a(w) = n_b(w)} = {λ, ab, aabb, ba, aabaabbb, aabbab, ...}
   S → λ | aSb | bSa | SS

Thank You
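As a supplement to the grammar exercises, the language of a small grammar can be explored by machine. The sketch below is an illustration under stated assumptions rather than a method from the slides: the function `language`, its parameters, and the dictionary encoding are hypothetical. Each symbol is a single character, every variable appears as a key of the rule dictionary, `""` stands for λ, and only strings up to `max_len` terminals are collected. The cap on sentential-form length (`2 * max_len + 2`) is a heuristic that is sufficient for the grammars shown here but is not a general guarantee.

```python
from collections import deque


def language(rules, start="S", max_len=6):
    """Terminal strings with at most max_len symbols derivable from `start`.

    Performs a breadth-first search over sentential forms, always expanding
    the leftmost variable (leftmost derivations suffice for these grammars).
    """
    max_form = 2 * max_len + 2  # heuristic cap on sentential-form length
    seen = {start}
    queue = deque([start])
    strings = set()
    while queue:
        form = queue.popleft()
        # Position of the leftmost variable, or None if the form is all terminals.
        pos = next((i for i, sym in enumerate(form) if sym in rules), None)
        if pos is None:
            strings.add(form)
            continue
        for rhs in rules[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            n_terminals = sum(sym not in rules for sym in new)
            # Terminals never disappear, so forms with too many can be pruned.
            if n_terminals <= max_len and len(new) <= max_form and new not in seen:
                seen.add(new)
                queue.append(new)
    return strings


G1 = {"S": ["aSb", ""]}                      # S -> aSb | lambda
G2 = {"S": ["aAb", ""], "A": ["aAb", ""]}   # S -> aAb | lambda, A -> aAb | lambda
EQUAL_AB = {"S": ["", "aSb", "bSa", "SS"]}  # equal numbers of a's and b's
```

Comparing `language(G1)` with `language(G2)` shows that the two grammars agree on all strings up to the bound, which is consistent with their being equivalent. Agreement up to a bound is evidence only; the proof still requires showing both L(G) ⊆ L and L ⊆ L(G).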
