WRAP - Cs RR 099

Original citation:

Rytter, W. (1987) 100 exercises in the theory of automata and formal languages.
University of Warwick. Department of Computer Science. (Department of Computer
Science Research Report). (Unpublished) CS-RR-099

Permanent WRAP url:


http://wrap.warwick.ac.uk/60795

Copyright and reuse:


The Warwick Research Archive Portal (WRAP) makes this work by researchers of the
University of Warwick available open access under the following conditions. Copyright ©
and all moral rights to the version of the paper presented here belong to the individual
author(s) and/or other copyright owners. To the extent reasonable and practicable the
material made available in WRAP has been checked for eligibility before being made
available.

Copies of full items can be used for personal research or study, educational, or not-for-
profit purposes without prior permission or charge. Provided that the authors, title and
full bibliographic details are credited, a hyperlink and/or URL is given for the original
metadata page and the content is not changed in any way.

A note on versions:
The version presented in WRAP is the published version, or version of record, and may
be cited as it appears here. For more information, please contact the WRAP Team at:
[email protected]

http://wrap.warwick.ac.uk/

Research report 99

100 EXERCISES IN THE THEORY OF AUTOMATA

AND FORMAL LANGUAGES

Wojciech Rytter*

(RR99)

Abstract

We present a collection of a hundred simple problems in the theory of automata and formal
languages which could be useful for tutorials and students interested in the subject. Solutions to
these problems require only the knowledge of an introductory course in automata and
formal languages which is usually taught for second or third year students of computer
science. However, some of the exercises require deeper understanding of the subject and
some sophistication. Most of the questions are about regular languages and finite
automata, and context-free languages and pushdown automata. A small collection of
problems concerning various interesting properties of strings is also included in the section
'miscellaneous'. There are no problems related to decidability or the complexity of
algorithms. The collection can be useful also because there are presently no exercise-books
in the theory of automata and formal languages.

* Institute of Informatics
Warsaw University, and

Department of Computer Science


University of Warwick
Coventry
CV4 7AL, UK

April 1987

100 EXERCISES IN THE THEORY OF AUTOMATA AND


FORMAL LANGUAGES

Wojciech Rytter

Institute of Informatics, Warsaw University and


Dept. of Computer Science, University of Warwick

We present a collection of one hundred simple problems in the theory of automata and formal
languages which could be useful for tutorials and students interested in the subject. Solutions to
these problems require only the knowledge of an introductory course in automata and formal
languages which is usually taught for second or third year students of computer science. However
some of the exercises require deeper understanding of the subject and some sophistication. Most of
the questions are about regular languages and finite automata, and context-free languages and
pushdown automata. A small collection of problems concerning various interesting properties of
strings is also included in the section 'miscellaneous'. There are no problems related to decidability
or to the complexity of algorithms. The collection can be useful also because there are at present no
exercise-books in the theory of automata and formal languages.

Many problems were selected from the following books:

S. Ginsburg. The mathematical theory of context-free languages. McGraw-Hill Book Co. (1966)
M.A. Harrison. Introduction to formal languages and automata. Addison-Wesley (1978)
J.E. Hopcroft and J.D. Ullman. Introduction to automata theory, formal languages and computation.
Addison-Wesley (1979)
M. Lothaire. Combinatorics on words. Encyclopedia of Mathematics and Its Applications, vol. 17,
Addison-Wesley (1983)
A. Salomaa. Jewels in formal language theory. Computer Science Press (1981)

Regular languages and finite automata

1. For a given string w of nonzero length n construct an (n+1)-state deterministic finite automaton
A_w accepting all strings containing w as a substring. A_w is called the string-matching machine.

Take prefixes of w as states of A_w.
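
The hint can be sketched in Python as follows. This is only one possible construction, not necessarily the intended solution: state i stands for the prefix of w of length i, and the names build_matcher and contains, as well as the explicit alphabet argument, are assumptions made for this illustration.

```python
def build_matcher(w, alphabet):
    """Sketch of an (n+1)-state DFA; state i represents the prefix w[:i]."""
    n = len(w)
    delta = [dict() for _ in range(n + 1)]
    for state in range(n + 1):
        for ch in alphabet:
            if state == n:
                delta[state][ch] = n  # once w has been seen, stay accepting
                continue
            candidate = w[:state] + ch
            # Next state: the longest prefix of w that is a suffix of candidate.
            k = len(candidate)
            while k > 0 and candidate[len(candidate) - k:] != w[:k]:
                k -= 1
            delta[state][ch] = k
    return delta


def contains(delta, text):
    """Run the DFA on text (all symbols must come from the alphabet used above)."""
    state = 0
    for ch in text:
        state = delta[state][ch]
    return state == len(delta) - 1
```

For example, build_matcher("abab", "ab") yields five states, and contains(delta, "bbababb") holds because "abab" occurs in the text.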

2. For a given set of strings {w_1, w_2, ..., w_k} construct a finite automaton A with output such that
after reading a_1...a_n A outputs the set of all indices i such that w_i is a suffix of a_1...a_n. A should
have O(n) states, where n = |w_1| + ... + |w_k|. We assume that the size of the alphabet and the number k
are constants. A is called the pattern-finding machine.
The structure of A can be based on the tree of prefixes of patterns.
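
A minimal sketch of such a machine, assuming nonempty patterns whose letters all belong to a given alphabet. It follows the usual failure-link construction on the tree of prefixes (in the style of Aho and Corasick), which is one way to realise the hint; the function names are hypothetical.

```python
from collections import deque


def build_pattern_machine(patterns, alphabet):
    """States are the nodes of the prefix tree; returns (goto, out)."""
    goto = [{}]      # goto[s][ch] -> next state; state 0 is the empty prefix
    out = [set()]    # out[s] -> indices i such that w_i is a suffix of state s
    for idx, p in enumerate(patterns, start=1):
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                out.append(set())
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].add(idx)
    fail = [0] * len(goto)
    queue = deque()
    for ch in alphabet:
        if ch in goto[0]:
            queue.append(goto[0][ch])
        else:
            goto[0][ch] = 0
    while queue:  # breadth-first, so shallower states are completed first
        s = queue.popleft()
        for ch in alphabet:
            if ch in goto[s]:
                t = goto[s][ch]
                fail[t] = goto[fail[s]][ch]
                out[t] |= out[fail[t]]
                queue.append(t)
            else:
                goto[s][ch] = goto[fail[s]][ch]
    return goto, out


def run(machine, text):
    """Return, for each prefix of text, the set of indices the machine outputs."""
    goto, out = machine
    s, outputs = 0, []
    for ch in text:
        s = goto[s][ch]
        outputs.append(set(out[s]))
    return outputs
```

The number of states is at most n+1, one per node of the prefix tree.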

3. Let G be a digraph. Construct a deterministic finite automaton accepting the set of all paths of G
(paths are sequences of nodes, two consecutive nodes have to be adjacent).

4. Construct a deterministic finite automaton accepting the language

L_q = {x.y : x, y are binary strings, [x.y]_2 ≤ q},

where q is a given rational number and [x.y]_2 is the number represented by x.y in binary. The least
significant digit is on the right.

5. Prove that if the number q is not rational then the language L_q from the previous exercise is not
regular.

6. Construct a deterministic finite automaton accepting the language

L = {xcy : (a[x]_2 + b[y]_2) mod p = r, x, y ∈ {0,1}+},

where a, b, p, r are given natural numbers.
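
One way to see that finitely many states suffice is to keep only the residues of [x]_2 and [y]_2 modulo p. The sketch below simulates such an automaton directly; it assumes p ≥ 1 and that the most significant digit is read first, and the function name accepts is chosen for this illustration only.

```python
def accepts(word, a, b, p, r):
    """Simulate a DFA whose state is (phase, [x]_2 mod p, [y]_2 mod p, seen)."""
    phase, rx, ry, seen = 0, 0, 0, False  # phase 0: reading x, phase 1: reading y
    for ch in word:
        if ch == "c":
            if phase == 1 or not seen:
                return False  # dead state: a second c, or x is empty
            phase, seen = 1, False
        elif ch in "01":
            bit = int(ch)
            if phase == 0:
                rx = (2 * rx + bit) % p  # appending a digit doubles the value
            else:
                ry = (2 * ry + bit) % p
            seen = True
        else:
            return False  # symbol outside {0, 1, c}
    return phase == 1 and seen and (a * rx + b * ry) % p == r
```

The state space has at most 2·p·p·2 elements plus a dead state, independently of the input length.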

7. Let U be a regular language, and U(i), i=1..n, be arbitrary languages over the same finite
alphabet (U(i) could be nonregular). Construct a deterministic finite automaton accepting the
language

L = {i_1 i_2 ... i_k : U ⊇ U(i_1)·U(i_2)·...·U(i_k)}, where · is the operation of language concatenation.

8. Show that if L is accepted by a nondeterministic finite automaton with n states then L is also
accepted by a deterministic finite automaton with 2^n states.
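
The standard route is the subset construction, sketched below under the assumption that the nondeterministic automaton has no empty moves and a single initial state. The dictionary-based representation and the names determinize and dfa_accepts are illustrative choices, not part of the exercise.

```python
def determinize(nfa_delta, start, accepting, alphabet):
    """Subset construction; nfa_delta maps (state, symbol) to a set of states."""
    start_set = frozenset([start])
    dfa_delta = {}
    seen = {start_set}
    todo = [start_set]
    while todo:
        subset = todo.pop()
        for ch in alphabet:
            target = frozenset(
                q2 for q in subset for q2 in nfa_delta.get((q, ch), ())
            )
            dfa_delta[(subset, ch)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_accepting = {s for s in seen if s & set(accepting)}
    return seen, dfa_delta, start_set, dfa_accepting


def dfa_accepts(dfa, word):
    """Run the determinized automaton on word."""
    _, dfa_delta, state, dfa_accepting = dfa
    for ch in word:
        state = dfa_delta[(state, ch)]
    return state in dfa_accepting
```

Only reachable subsets are generated, so the result never has more than 2^n states and is often much smaller.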

9. Show that if L is accepted by an n-state deterministic finite automaton then L^R is accepted by a
deterministic finite automaton with at most 2^n states, where L^R = {w^R : w ∈ L}, w^R is the mirror
image of the string w, e.g. (abc)^R = cba.

10. Let L = {w ∈ {0,1}+ : the n-th symbol of w is 1}. Construct an (n+2)-state deterministic finite
automaton accepting L, a 2^n-state deterministic finite automaton accepting L^R, and an (n+1)-state
nondeterministic finite automaton accepting L^R.

11. Prove that every deterministic finite automaton accepting L^R requires at least 2^n states, where L
is the language from the previous exercise.

12. Show that a language accepted by a deterministic finite automaton with n states is infinite if and
only if it contains a word w of length n ≤ |w| < 2n. Prove also that every infinite regular language
contains a language w_1 w_2* w_3 for some strings w_1, w_2, w_3.

13. Prove that for any language L (which may be nonregular) over a finite alphabet the languages
sub(L) and sup(L) are regular, where sub(L) is the set of all subsequences of strings in L, and
sup(L) is the set of all supersequences of strings in L. For example abc is a subsequence of acbac.
(v is a supersequence of w iff w is a subsequence of v.)
Let « denote the relation 'to be a subsequence of', for example acd « badcbbdb. Assume that the
alphabet is finite. Use the following facts: every set of pairwise incomparable elements with respect
to « is finite; for every finite language L the languages sup(L) and sub(L) are regular.

14. Prove that for any language L (which may be nonregular) over a one-element alphabet the
language L* is regular.

15. Let x ≤ y iff x is a prefix of y. Prove that if L is a regular language then the set max(L) of
maximal elements of L (with respect to ≤) is also a regular language.
Prove that if L is a regular language then the set min(L) of its minimal elements
(with respect to ≤) is also a regular language.

16. Let h be a homomorphism. Prove that the set {h(w) : h(w) = h(y) for some string y, w ≠ y} is a
regular language.

17. We define the star-height (sh, in short) of regular expressions (with operations of union,
concatenation and closure *) as follows. If there is no operation * in the expression E then sh(E) = 0,

sh(E*) = sh(E)+1, sh(E_1 ∪ E_2) = max(sh(E_1), sh(E_2)) and sh(E_1·E_2) = max(sh(E_1), sh(E_2)).

For a regular language L we define sh(L) = min{sh(E) : E is a regular expression describing L}.
Prove that for every regular language L over the one-letter alphabet we have sh(L) ≤ 1.

18. Prove that if L is a regular language then cycle(L) is also regular, where cycle(L) = {uv : vu ∈ L}.

19. Let L be any regular language whose words have lengths divisible by three. For a string
x = x_1 x_2 x_3, where |x_1| = |x_2| = |x_3|, denote x(1) = x_1, x(2) = x_2, x(3) = x_3.

Let L(i) = {x(i) : x ∈ L}, L(i,j) = {x(i)x(j) : x ∈ L}.

Prove that L(1), L(2), L(3), L(1,2), L(2,3) are regular.

20. Prove that there exists a regular language L such that L(1,3) is nonregular.

21. For a given string w of nonzero length n construct a 2n-state deterministic finite automaton
accepting the set of all nonempty substrings of w. For a given string w of length n and a string x
define S_x to be the set of all positions ending an occurrence of x in w. Use the following facts: the
sets S_x are nonoverlapping (if S_x ∩ S_y ≠ ∅ then S_x is a subset of S_y or S_y is a subset of S_x). There
are at most 2n sets S_x.

Use sets S_x as states of an automaton.

22. Prove that if L is regular then the language {x : x x^R ∈ L} is also regular.

23. Prove that if L is regular then the language √L = {x : xx ∈ L} is regular.

24. Prove that if L is a regular language then the language root(L) = {x : x^n ∈ L for some natural n} is
also regular.

25. Prove that if L is regular then {x : xy ∈ L for some y with |y| = |x|^2} is a regular language.

26. Prove that if L is regular then {x : xy ∈ L for some y such that |y| = 2^|x|} is also a regular
language.

27. Let L_1 || L_2 be the set of all strings w which can be decomposed into two disjoint subsequences
w_1, w_2 such that w_1 ∈ L_1 and w_2 ∈ L_2, e.g. it can be w = abcabd, w_1 = cbd and w_2 = aba.
Prove that if L_1, L_2 are regular then L_1 || L_2 is also regular.

28. Let L# = L ∪ (L || L) ∪ (L || L || L) ∪ ... . Find a regular language L over a two-letter alphabet
such that L# is nonregular.

29. Prove that every language accepted by a two-way finite automaton is regular. (The two-way
finite automaton can move its input head in both directions, left and right.)

30. Construct a finite automaton with output which, reading the input (a_1,b_1) (a_2,b_2) ... (a_n,b_n),
outputs c_1 c_2 ... c_n, where c = a+b, and a, b, c are numbers whose binary representations are a_1...a_n,
b_1...b_n, c_1...c_n, respectively, with a_1, b_1, c_1 being the least significant digits. Assume for
simplicity that a_n = b_n = 0.

Solve the same problem, but this time for c = a-b. Assume that a > b (in the other case assume that the
output is meaningless).
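
A two-state machine suffices in both cases, the state being the current carry (or borrow). The following Python sketch simulates it; the helper names to_bits and from_bits and the list-of-pairs input format are assumptions made for the illustration.

```python
def to_bits(value, n):
    """Binary digits of value, least significant first, padded to n digits."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << i for i, bit in enumerate(bits))


def add_transducer(pairs):
    """State = carry (0 or 1); emits one digit of a+b per input pair."""
    carry, out = 0, []
    for a_i, b_i in pairs:
        s = a_i + b_i + carry
        out.append(s % 2)
        carry = s // 2
    return out


def sub_transducer(pairs):
    """State = borrow (0 or 1); the output is meaningful only when a >= b."""
    borrow, out = 0, []
    for a_i, b_i in pairs:
        d = a_i - b_i - borrow
        out.append(d % 2)
        borrow = 1 if d < 0 else 0
    return out
```

For instance, with a = 5 and b = 7 written on five digits, add_transducer returns the digits of 12.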

31. Prove that the corresponding finite automaton with output does not exist if we replace the
operation "+" by multiplication (requiring only n digits of the product to be produced).

32. A set of integers is linear iff it is of the form {c+pn : n = 0,1,2,...}. A set is semilinear iff it is a
finite union of linear sets. Let R be a regular language. Prove that the set {n : a^n ∈ R} is a
semilinear set.

33. The language L has the finite power property iff L^k = L^(k+1) for some natural k. Prove that a
nonempty language L has the finite power property iff L^k = L* for some natural k.

34. Prove that every infinite regular language L over the one-element alphabet has the finite power
property, if ε is in L.

35. If g, h are homomorphisms then denote the language {x : h(x) = g(x)} by E(g,h). Languages of
this type are called equality languages.
Find a nonregular equality language over the alphabet {a,b}.

36. Prove that the language {a^i b^j : gcd(i,j) = 1} is not a regular language. (gcd is the abbreviation of
'greatest common divisor'.)

37. Assume that each production of the grammar G is of the form xA -> yB or A -> ε, where x, y are
strings of terminal symbols, and A, B are nonterminal symbols. Prove that the language generated
by G is regular.

38. Prove that if A, B are given languages and A does not contain the empty word then the equation
X = AX ∪ B has only one solution.

39. Let A be a nondeterministic finite automaton. The usual definition of acceptance is the
existence of an accepting path. Change this definition as follows: A accepts the input word w iff
every possible path of computation ends in an accepting state. Prove that the language accepted by A
is regular.

40. Let A be a nondeterministic one-way finite automaton without ε-moves. Assign to the states of
A boolean operations 'not', 'and' and 'or'. One can imagine the computation of A for a given input
word w as an expression-tree T of possible computation paths; the nodes of T are situations
(configurations) which can occur. With each internal node of T associate the same operation as the
state of A corresponding to this node; the leaves have boolean values: true, if the corresponding state
is accepting, false otherwise. One can now compute the boolean value of the root of T. We say that
A accepts w iff the root of T has the value true. Finite automata with such a definition of acceptance
are called alternating finite automata.
Prove that alternating automata accept only regular languages.
Assume that an alternating automaton A has states s_1,...,s_n. Let x = (x_1,...,x_n) be vectors of boolean
values. Take boolean functions f(x) as states of a constructed automaton. If A is scanning the i-th
symbol of the input then f(x) is interpreted as the value of the root of the part of the computation tree
of A (of height i), assuming that the leaf corresponding at this moment to s_j has value x_j, for
j=1..n.

41. A one-pebble two-way deterministic finite automaton A can move its input head in both
directions and has the added capability of marking a tape square (scanned by its input head) by
placing a pebble on it. The move of the automaton depends on the present state, scanned input
symbol and the presence of the pebble on the tape square scanned. The pebble can be removed and
placed later on another square. Prove that two-way deterministic finite automata with one pebble
accept only regular languages.
Add two tracks to the input string indicating for each state p, the state q in which A will return if it
moves left or right from the given tape cell in state p, under the assumption that the pebble is not
encountered. Note that A operating on the augmented tape does not need its pebble. Make use of the
fact that if R is a regular language then h(R) is also regular. The homomorphism h can be used to
erase the additional tracks.

Context-free languages and pushdown automata

42. Prove that the language {a^n b^n c^n : n ≥ 1} is not context-free. This is a classical example of
a simple language which is not context-free. Prove that no infinite subset of this language is a
context-free language.

43. Construct a context-free grammar generating the complement of the language from the previous
exercise.

44. Prove that the language {xcy : x, y ∈ {a,b}+, x = y} is not context-free.

45. Construct a context-free grammar generating the complement of the language from the previous
exercise.

46. Prove that the language {x x^R : x ∈ {a,b}+} cannot be accepted by a deterministic pushdown
automaton. Let A be a deterministic pushdown automaton. For each string w there is a string w'
such that A after reading ww' has the minimal (for all w') height of the stack. This means that A
after reading ww' uses only the information (top stack symbol, state) related to ww'. There are two
distinct words w_1, w_2 such that this information for w_1 w_1' and w_2 w_2' is the same. How does A react
to the word w_1 w_1' (w_2 w_2')^R?

47. Prove that the complement of the language from the previous exercise is a context-free
language.

48. Construct a one-nonterminal context-free grammar generating the language of all strings over
the alphabet {a,b} with the same number of a's as b's.
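
As a way of testing a candidate answer by brute force, the sketch below checks membership for the one-nonterminal grammar S -> aSbS | bSaS | ε. This grammar is a well-known candidate and is an assumption of this example, not something stated in the exercise; inputs are assumed to be strings over {a,b}.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def derives(s):
    """True when S =>* s for the grammar S -> aSbS | bSaS | (empty word)."""
    if s == "":
        return True
    other = "b" if s[0] == "a" else "a"
    # s must split as  s[0] u other v  with S =>* u and S =>* v.
    return any(
        s[j] == other and derives(s[1:j]) and derives(s[j + 1:])
        for j in range(1, len(s))
    )
```

Comparing derives(s) with a direct count of letters for all short strings gives some evidence for the candidate, although it does not replace a proof.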

49. Show that the language {x$y: x is a subword of y; x,y are over the alphabet {a,b} } is not a
context-free language.

50. Show that the language {x$y : x^R is a subword of y; x, y are over the alphabet {a,b}} is
context-free.

51. Prove that the complement of the previous language is not context-free.

52. Prove that the language {a^i b^j c^k : j ≠ k and i ≠ k} is not a context-free language.

53. Let L = {a^n b^k a^n b^k : n, k ≥ 1}. Is L a context-free language?



54. Find a context-free language L such that the language {x : xy ∈ L for some y, |x| = |y|} is not
context-free.

55. Show that the language {x#y^R : x, y ∈ {0,1}+, [y]_2 = [x]_2 + 1} is a context-free language.

Is the language {x#y : x, y ∈ {0,1}+, [y]_2 = [x]_2 + 1} context-free?

56. We say that a context-free grammar G is self-embedding iff there is a nonterminal symbol A
such that A->*xAy, for some nonempty strings x,y. Prove that if a context-free grammar G is not
self-embedding then G generates a regular language.

57. Prove that the set of all possible contents of the pushdown store of a nondeterministic
pushdown automaton starting with one-element pushdown store is a regular language.

58. Prove that intersection of a regular language R and a context-free language L is a context-free
language.
Prove that if L is deterministic (accepted by a deterministic pushdown automaton) then the resulting
language is also deterministic.

59. Find two context-free languages L_1, L_2 such that L_1 || L_2 is not a context-free language (see
exercise 27 for the definition of the operation ||).

60. Find a context-free language L such that L# is not a context-free language (see exercise 28 for
the definition of the operation # ).

61. Prove that if X, Y are regular languages then the language ⋃_{n≥1} (X^n ∩ Y^n)
is a context-free language.

62. Find regular languages X, Y such that ⋃_{n≥1} (X^n ∩ Y^n) = {a^n b^n : n ≥ 1}.

63. Find regular languages X, Y and Z such that the language ⋃_{n≥1} (X^n ∩ Y^n ∩ Z^n) is not
context-free.

64. Find a context-free language L such that the language √L is not context-free (√ is the operation
from exercise 23).

65. Let h(a) = a, h(b) = b, g(a') = a, g(b') = b and h(x) = g(x) = ε for all other symbols x. The language
L = E(g,h) is over the alphabet {a, b, a', b'} and it is called the twin shuffle language. Every word in
L is a shuffle of two copies of the same word. Prove that the language L is not context-free.

66. Prove that if U is regular then {x y^R : x ≠ y, xy ∈ U} is a context-free language.

67. Prove that if L is a context-free language and R is a regular language then L || R is a context-free
language (see exercise 27 for the definition of the operation ||).

68. Prove that every context-free language over the one-element alphabet is regular.

69. A language has the prefix property iff for every two of its strings one is a prefix of the other. Prove
that every context-free language with the prefix property is regular.

70. Prove that if L is a context-free language then {a^|w| : w ∈ L} is a regular language.

71. Let {a,b}* ⊇ U be a regular language, and h_1, h_2 be homomorphisms. Prove that

{h_1(u) c (h_2(u))^R : u ∈ U} is a linear context-free language.

A language is linear if it can be generated by a context-free grammar in which right sides of
productions contain at most one nonterminal.

72. Prove that if L is a linear language and R is a regular language then L ∩ R is a linear language.

73. Prove that the language from exercise 48 is not a linear language.
Use a stronger version of the 'uvwxy' lemma (pumping lemma), where |uvxy| is bounded by a
constant.

74. Let h_1, h_2 be two homomorphisms whose values are words not containing the symbol '$'.
Prove that the languages {x$y^R : h_1(x) = h_2(y)} and {x$y^R : h_1(x) ≠ h_2(y)} are linear context-free
languages.

75. Prove that if L is a deterministic context-free language then the language min(L) is also a
deterministic context-free language (the operation min is the same as in exercise 15).

76. Let L = {a^i b^j c^k : k ≥ i or k ≥ j}. Show that the set min(L) is not a context-free language.
Consequently L is not a deterministic context-free language.

77. Find a context-free language L such that max(L) is not a context-free language. Use a language
similar to the one from exercise 76.

78. Let the language L ⊆ {xcy : x, y ∈ {a,b}*} be generated by a linear context-free grammar, such
that the only production without a nonterminal on the right side is of the form A -> c. Prove that
{x : xcy ∈ L, for some y} is a regular language.

79. Prove that if {a,b}* ⊇ U and the language L = {b^|x| c x : x ∈ U} is a context-free language then U
is regular. Let A be a pushdown automaton accepting L. Prove that A can be simulated by a one-turn
pushdown automaton A' (whose stack changes its mode from nondecreasing to nonincreasing only
when the symbol c is read). Then simulate by a finite automaton a part of the computation of A' on
the suffix cx of the input.
If we remove the letter c then U can be nonregular; consider L = {b^(3n) a^n : n ≥ 1}.

80. Prove that if L is a context-free language then cycle(L) is also a context-free language (the
operation cycle was defined in exercise 18).

81. For given homomorphisms h and g describe context-free languages L_1, L_2 and a
homomorphism f such that E(h,g) = f(L_1 ∩ L_2). Hence E(h,g) = ∅ if and only if L_1 ∩ L_2 = ∅. The
problem of checking emptiness of the language E(h,g) is called the Post correspondence problem.

82. A two-way deterministic pushdown automaton A (2dpda A, in short) is a deterministic
pushdown automaton whose input head can move in two directions (left and right) and can detect
the left and right end of the input word. Construct a 2dpda which accepts the language {x$y : x is a
subword of y} (the string matching problem).

83. Let P = {x x^R : x ∈ {a,b}+} and V be the set of all words over the alphabet {a,b}. P is the set of
nontrivial palindromes of even length over the alphabet {a,b}.
Construct a 2dpda A which accepts the language PV (the language of prefix palindromes).

84. Construct a 2dpda which accepts the language P^3 V. Use the following fact:

if w ∈ P*, first(w) = min{|x| : w = xy, x ∈ P} and parse(w) = min{|x| : w = xy, x ∈ P, y ∈ P*} then
first(w) = parse(w).
A is recomputing many times end-positions of the first and second prefix palindrome.

85. Construct a 2dpda which accepts the language P^2. Use the following fact: if x ∈ P^2 then x = yz
for some y, z such that y, z ∈ P and y is the longest prefix of x which belongs to P, or z is the longest
suffix of x which belongs to P.
Use the 2dpda from exercise 83 to compute the longest prefix palindrome and the longest suffix
palindrome.

86. Construct a 2dpda which accepts the language {a^n b^p(n) : n ≥ 1}, where p is a given polynomial
with natural coefficients.

Miscellaneous

87. The period of the string u is the shortest word v such that u is a prefix of v^k for some k. Prove
that if x has periods of sizes p, q and p+q ≤ |x| then x has a period of size gcd(p,q).
Use Euclid's algorithm.
We say that v is a full period of u iff u = v^k for some k. Prove that xy = yx if and only if x and y have
the same smallest full period. Let gcd'(u,v) denote the longest common full period of u, v, if it
exists. Show that gcd'(u,v) exists iff uv = vu. If it exists then |gcd'(u,v)| = gcd(|u|,|v|).
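
Both statements can be tested by brute force on short words before attempting a proof. The sketch below assumes that "x has a period of size p" means x[i] = x[i+p] for all valid i (equivalently, x is a prefix of a power of its prefix of length p); the function names are hypothetical.

```python
def has_period(x, p):
    """True when x[i] == x[i+p] for every i with i+p < len(x)."""
    return all(x[i] == x[i + p] for i in range(len(x) - p))


def smallest_full_period(u):
    """Shortest v with u == v^k for some k; u is assumed to be nonempty."""
    n = len(u)
    for d in range(1, n + 1):
        if n % d == 0 and u[:d] * (n // d) == u:
            return u[:d]
```

For example, "abaababaaba" has periods 5 and 8 but not gcd(5,8) = 1, which is consistent with the claim because 5+8 exceeds its length 11.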

88. Let f_1 = b, f_2 = a and f_(n+2) = f_(n+1) f_n. The strings f_n are called Fibonacci words. Prove that
c(f_(n-1) f_n) = f_n f_(n-1), where c is the operation of interchanging the last two letters of the word.
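
The identity is easy to check experimentally for small n. The sketch assumes the recurrence reads f_(n+2) = f_(n+1) f_n with f_1 = b and f_2 = a, which is a reading of the printed formula; the function names are illustrative.

```python
def fib_words(m):
    """Return a list f with f[n] the n-th Fibonacci word for 1 <= n <= m (m >= 2)."""
    f = [None, "b", "a"]  # index 0 is unused so that indices match the text
    for n in range(3, m + 1):
        f.append(f[n - 1] + f[n - 2])
    return f


def swap_last_two(s):
    """The operation c: interchange the last two letters of s."""
    return s[:-2] + s[-1] + s[-2]
```

For n = 3 this gives c(f_2 f_3) = c(aab) = aba = f_3 f_2.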

89. Let P_1 = {x : x = x^R, |x| > 1, x ∈ {a,b}*}. P_1 is the set of nontrivial palindromes. The elements of
P_1+ are called palstars. For every palstar w define

first_1(w) = min{|x| : w = xy, x ∈ P_1} and parse_1(w) = min{|x| : w = xy, x ∈ P_1, y ∈ P_1*}.

Prove that parse_1(w) ∈ {first_1(w), 2 first_1(w)+1, 2 first_1(w)-1}.
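
The claim can be explored by exhaustive search over short words. The sketch below reads it as parse_1(w) ∈ {first_1(w), 2 first_1(w)+1, 2 first_1(w)-1}; this reading is an assumption, chosen because first_1(w) can never exceed parse_1(w). Function names are hypothetical.

```python
def is_pal(x):
    """Nontrivial palindrome: length greater than 1 and equal to its mirror image."""
    return len(x) > 1 and x == x[::-1]


def palstar_suffixes(w):
    """ok[i] is True when w[i:] is a product of nontrivial palindromes (or empty)."""
    n = len(w)
    ok = [False] * (n + 1)
    ok[n] = True
    for i in range(n - 1, -1, -1):
        ok[i] = any(is_pal(w[i:j]) and ok[j] for j in range(i + 2, n + 1))
    return ok


def first_and_parse(w):
    """Return (first_1(w), parse_1(w)) for a palstar w, or None otherwise."""
    ok = palstar_suffixes(w)
    if not w or not ok[0]:
        return None
    n = len(w)
    first = min(j for j in range(2, n + 1) if is_pal(w[:j]))
    parse = min(j for j in range(2, n + 1) if is_pal(w[:j]) and ok[j])
    return first, parse
```

For w = aabaa the shortest palindromic prefix is aa, but the remainder baa is not a palstar, so first_1(w) = 2 while parse_1(w) = 5 = 2·2+1.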

90. Let h be a homomorphism whose domain is {a,b}*. Prove that h is one-to-one if and only if
h(ab) ≠ h(ba).

91. Let h(a) = ab, h(b) = ba. Prove that h^i(a) is a prefix of h^(i+1)(a). Define H to be the infinite word
a_0 a_1 a_2 ... such that each h^i(a) is its prefix. Prove that the symbol a_n of H is a if and only if the
number of ones in the binary representation of n is even. Prove that h^(i+1)(a) = h^i(a) Q(h^i(a)), where Q
is the homomorphism Q(a) = b, Q(b) = a.
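
The three properties can be checked on finite prefixes with a few lines of Python. The sketch assumes positions in H are counted from 0, and the function names h_power and swap_letters are chosen for the illustration.

```python
def h(word):
    """The homomorphism h(a) = ab, h(b) = ba."""
    images = {"a": "ab", "b": "ba"}
    return "".join(images[ch] for ch in word)


def h_power(i):
    """The word h^i(a)."""
    w = "a"
    for _ in range(i):
        w = h(w)
    return w


def swap_letters(word):
    """The homomorphism Q(a) = b, Q(b) = a."""
    return word.translate(str.maketrans("ab", "ba"))
```

Here h_power(3) is abbabaab, the first eight symbols of H.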

92. What is the cardinality of the set of all square-free (not containing a subword of the form xx)
words over the two-letter alphabet?

93. Let {ab,ba}* ⊇ L. Prove that if x ∈ L then axa and bxb do not belong to L.

Using this result prove that if w does not contain a subword of the form cvcvc (c ∈ {a,b}), then
h(w) has the same property, where h is the homomorphism from problem 91.

94. Using the result of the previous exercise prove that the infinite word H contains no subword of
the form cvcvc (c ∈ {a,b}). Hence H is cube-free (has no subword of the form xxx).

95. Let α(a) = a, α(b) = ab, α(c) = abb. Using the result of the previous problem prove that the infinite
word M = α^(-1)(H) is well defined and M is square-free. Conclude that the set of square-free words
over the three-letter alphabet is infinite.
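
A finite experiment illustrates the construction. The sketch generates a prefix of H from the parity of binary ones (the characterisation stated in exercise 91), decodes it block by block, and provides checks for squares and for factors cvcvc; all function names are assumptions of this example.

```python
def thue_morse(length):
    """Prefix of H: symbol n is a iff n has an even number of ones in binary."""
    return "".join(
        "a" if bin(n).count("1") % 2 == 0 else "b" for n in range(length)
    )


def decode(prefix):
    """Invert a -> a, b -> ab, c -> abb on a prefix of H that starts with a."""
    blocks = prefix.split("a")[1:]  # the runs of b following each a
    blocks = blocks[:-1]            # the last block may be cut off, so drop it
    return "".join("abc"[len(b)] for b in blocks)


def has_square(w):
    """True when w contains a nonempty factor of the form xx."""
    n = len(w)
    return any(
        w[i:i + l] == w[i + l:i + 2 * l]
        for l in range(1, n // 2 + 1)
        for i in range(n - 2 * l + 1)
    )


def has_overlap(w):
    """True when w contains a factor cvcvc, i.e. a square followed by its first letter."""
    n = len(w)
    return any(
        w[i:i + l] == w[i + l:i + 2 * l] and w[i + 2 * l] == w[i]
        for l in range(1, n // 2 + 1)
        for i in range(n - 2 * l)
    )
```

Decoding the first sixteen symbols of H gives cbacabc, the beginning of M.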

96. The language L is a code if any product of words from L can be "decoded" in a unique way: if
the words w_i, v_i are from L and w_1 w_2 ... w_k = v_1 v_2 ... v_j then k = j, and v_i = w_i, for i=1..k.
Let the code-indicator of a word w be k^(-|w|), where k is the size of the alphabet. Let the code-indicator
(ci(L), in short) of the language L be the sum of code-indicators of words in L.
Prove that if L is a code then ci(L) ≤ 1.
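
The inequality can be illustrated on small finite languages. The sketch computes ci(L) exactly with rational arithmetic and counts factorizations of a word over L by dynamic programming; it assumes L is finite and does not contain the empty word, and the function names are hypothetical.

```python
from fractions import Fraction


def code_indicator(L, k):
    """Sum of k^(-|w|) over the words w of L, for an alphabet of size k."""
    return sum(Fraction(1, k ** len(w)) for w in L)


def count_factorizations(word, L):
    """Number of ways to write word as a product of words from L."""
    ways = [0] * (len(word) + 1)
    ways[0] = 1
    for i in range(1, len(word) + 1):
        for w in L:
            if len(w) <= i and word[i - len(w):i] == w:
                ways[i] += ways[i - len(w)]
    return ways[len(word)]
```

For L = {a, b, ab} over a two-letter alphabet, ci(L) = 5/4 > 1, and indeed ab factorizes in two ways, so L is not a code. For the suffix-free set {a, ab, bb}, ci(L) = 1.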

97. The language L is said to be commutative iff xy = yx for every two words from L. Prove that if L
is commutative then w* ⊇ L, for some string w.

98. Let R be a given symmetric binary relation on the alphabet. We write x ~ y iff the string x can be
obtained from y by applying several times the operation of exchanging adjacent letters a, b such that
R(a,b) holds. Let h_{a,b} be the homomorphism erasing all symbols except a, b. Prove that
x ~ y if and only if the following two conditions are satisfied:
(i) for each letter a, card_a(x) = card_a(y);

(ii) for each pair of symbols a, b, if R(a,b) does not hold then h_{a,b}(x) = h_{a,b}(y).

(card_c(x) denotes the number of occurrences of the letter c in x.)

99. Let X ∩ Y = ∅. Let R(a,b) hold iff (a ∈ X, b ∈ Y) or (a ∈ Y, b ∈ X). Prove that if L is a regular
language over X ∪ Y then {uv : u ∈ X*, v ∈ Y*, u v^R ~ w for some w ∈ L} is a context-free language.

100. The language L is said to be bounded iff L is a subset of w_1* w_2* ... w_k* for some words
w_1,...,w_k. Prove that the language {a,b}* of all words over the alphabet {a,b} is not bounded.

Consider the words w_j = a b a^2 b^2 a^3 b^3 ... a^j b^j, for j > 1.
