Atc 01 Module Notes
(Approved by AICTE, New Delhi, Affiliated to VTU Belagavi & Recognized by Govt. of Karnataka)
Strings
A string (word) is a finite sequence of symbols chosen from some alphabet.
Example: 01101 is a string from the binary alphabet Ʃ = {0, 1}. The string 111 is another string
chosen from this alphabet.
Empty String
The empty string is the string with zero occurrences of symbols. This string, denoted ɛ, is a string
that may be chosen from any alphabet whatsoever.
Length of a String
The length of a string is the number of positions for symbols in the string. It is common to say that
the length of a string is “the number of symbols” in the string, although strictly it counts positions:
01101 uses only two distinct symbols but has length 5.
The standard notation for the length of a string w is |w|.
Example: |10101| = 5
|ɛ| = 0
Powers of an Alphabet
If Ʃ is an alphabet, we can express the set of all strings of a certain length from that alphabet by
using an exponential notation. We define Ʃ^k to be the set of strings of length k, each of whose
symbols is in Ʃ.
Example: If Ʃ = {0, 1}, then Ʃ^0 = {ɛ}, Ʃ^1 = {0, 1} and Ʃ^2 = {00, 01, 10, 11}.
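The definition above can be tried out with a short Python sketch. The function name power is an assumption made for this illustration, and the set of length-k strings is written Ʃ^k here.

```python
from itertools import product

def power(alphabet, k):
    """Return all strings of length k over the alphabet, as a sorted list."""
    # product(..., repeat=k) yields every k-tuple of symbols;
    # for k = 0 it yields exactly one empty tuple, i.e. the empty string.
    return sorted("".join(symbols) for symbols in product(sorted(alphabet), repeat=k))

sigma = {"0", "1"}
print(power(sigma, 0))  # ['']
print(power(sigma, 2))  # ['00', '01', '10', '11']
```

An alphabet with m symbols has m^k strings of length k, which is why the list for k = 2 over {0, 1} has four entries.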
Concatenation of Strings
Let x and y be strings. Then xy denotes the concatenation of x and y, that is, the string formed by
making a copy of x and following it by a copy of y.
Example: Let x = 01101 and y = 110. Then xy = 01101110 and yx = 11001101.
For any string w, the equations ɛw = wɛ = w hold. That is, ɛ is the identity for concatenation.
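As a quick check of the example, Python strings can stand in for strings over {0, 1}, with + playing the role of concatenation and "" playing the role of ɛ. The variable names are chosen only for this sketch.

```python
x = "01101"
y = "110"
epsilon = ""  # the empty string

xy = x + y  # copy of x followed by a copy of y
yx = y + x  # concatenation is not commutative in general

print(xy)  # 01101110
print(yx)  # 11001101
print(epsilon + x == x and x + epsilon == x)  # True
```

Lengths add under concatenation: |xy| = |x| + |y|, here 5 + 3 = 8.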
Languages
A set of strings all of which are chosen from some Ʃ*, where Ʃ is a particular alphabet, is called a
language.
Example: i) The language of all strings consisting of n 0’s followed by n 1’s for some n ≥ 0:
{ɛ, 01, 0011, 000111, ...}
ii) The set of strings of 0’s and 1’s with an equal number of each:
{ɛ, 01, 10, 0011, 0101, 1010, 1001, ...}
Faculty Name: BASAVARAJ K. M. Subject: Automata Theory & Computability
Jain College of Engineering & Research Belagavi
Problems
In automata theory, a problem is the question of deciding whether a given string is a
member of some particular language.
If Ʃ is an alphabet, and L is a language over Ʃ, then the problem L is: given a string w in Ʃ*,
decide whether or not w is in L.
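For the two example languages given under Languages, the corresponding problems can be sketched as Python functions that take a string w and answer yes or no. The function names are hypothetical, and these are ordinary programs rather than formal machines.

```python
def in_0n1n(w):
    """Decide whether w consists of n 0's followed by n 1's for some n >= 0."""
    n, remainder = divmod(len(w), 2)
    return remainder == 0 and w == "0" * n + "1" * n

def in_equal_01(w):
    """Decide whether w is over {0, 1} and has equally many 0's and 1's."""
    return set(w) <= {"0", "1"} and w.count("0") == w.count("1")

print(in_0n1n("000111"))    # True
print(in_0n1n("0101"))      # False
print(in_equal_01("1001"))  # True
```

Note that 1001 is in the second language but not in the first, so the two problems are different even though both are stated over the same alphabet.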
Kleene Star
Definition: The Kleene star, denoted by Σ*, is a unary operator on a set of symbols or strings,
Σ, that gives the infinite set of all possible strings of all possible lengths over Σ, including ε.
Representation: Σ* = Σ^0 ∪ Σ^1 ∪ Σ^2 ∪ ..., where Σ^p is the set of all possible strings of length p.
Example: If Σ = {a, b}, Σ* = {ɛ, a, b, aa, ab, ba, bb, ...}
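Since Σ* is infinite, a program cannot build it all at once, but it can enumerate it length by length. The sketch below uses a Python generator for this; the name kleene_star is an assumption, and the alphabet is assumed to be non-empty.

```python
from itertools import count, islice, product

def kleene_star(alphabet):
    """Yield every string over the alphabet, shortest first.

    Assumes a non-empty alphabet; strings of length 0 come first,
    then length 1, then length 2, and so on without end.
    """
    symbols = sorted(alphabet)
    for k in count(0):
        for tup in product(symbols, repeat=k):
            yield "".join(tup)

# Take only the first seven strings of the infinite enumeration.
first_seven = list(islice(kleene_star({"a", "b"}), 7))
print(first_seven)  # ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The first entry '' is the empty string ɛ, which matches the example listing above.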
A Language Hierarchy
A machine-based hierarchy of language classes is shown in the diagram. We have four
language classes:
1. Regular languages, which can be accepted by some finite state machine.
2. Context-free languages, which can be accepted by some pushdown automaton.
3. Decidable (or simply D) languages, which can be decided by some Turing machine that always
halts.
4. Semi-decidable (or SD) languages, which can be semi-decided by some Turing machine that
halts on all strings in the language.
Each of these classes is a proper subset of the next class, as illustrated in the figure. As we move
outward in the language hierarchy, we have access to tools with greater and greater expressive power.
For example, we can define A^nB^nC^n = {a^n b^n c^n : n ≥ 0} as a decidable language but not as a
context-free or a regular one. This matters because expressiveness generally comes at a price. The
price may be computational efficiency, decidability, or clarity.
• Computational efficiency: Finite state machines run in time that is linear in the length of the input
string. A general context-free parser based on the idea of a pushdown automaton requires time that
grows as the cube of the length of the input string. A Turing machine may require time that grows
exponentially (or faster) with the length of the input string.
• Decidability: There exist procedures to answer many useful questions about finite state machines.
For example, does an FSM accept some particular string? Is an FSM minimal? Are two FSMs
identical? A subset of those questions can be answered for pushdown automata. None of them can
be answered for Turing machines.
• Clarity: There exist tools that enable designers to draw and analyze finite state machines. Every
regular language can also be described using the regular expression pattern language. Every
context-free language, in addition to being recognizable by some pushdown automaton, can be
described with a context-free grammar.
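To make the A^nB^nC^n example concrete, here is a minimal sketch of a procedure that always halts and decides membership. It is an ordinary Python function standing in for a Turing machine that always halts, not a formal construction; the name in_anbncn is an assumption.

```python
import re

def in_anbncn(w):
    """Decide whether w = a^n b^n c^n for some n >= 0."""
    # Step 1: check the shape a*b*c*. This part alone is a regular property.
    match = re.fullmatch(r"(a*)(b*)(c*)", w)
    if match is None:
        return False
    # Step 2: compare the three block lengths. This is the part that
    # the notes say cannot be done by a context-free or regular device.
    a_run, b_run, c_run = match.groups()
    return len(a_run) == len(b_run) == len(c_run)

print(in_anbncn("aabbcc"))  # True
print(in_anbncn("aabbc"))   # False
```

The function returns an answer for every input string, which is what makes the language decidable.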
[...] differences matter anymore. If they've driven M to the same state, they share a fate. No matter
what comes next, either all of them cause M to accept or all of them cause M to reject.
Regular Languages
A regular language is a language that can be expressed with a regular expression or accepted
by a deterministic or non-deterministic finite automaton (finite state machine). A language is
a set of strings made up of characters from a specified alphabet, or set of symbols.
Every regular language is therefore a subset of the set of all strings over its alphabet. Regular
languages are used in parsing and designing programming languages and are one of the first
concepts taught in computability courses. They help computer scientists recognize patterns in data
and group certain computational problems together; once problems are grouped, similar
approaches can be taken to solve them.
A language is regular if it is accepted by some DFSM. Some examples are listed below.
• {w ∈ {a, b}* | every a is immediately followed by b}.
• {w ∈ {a, b}* | every a region in w is of even length}.
• Binary strings with odd parity.
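Two of the examples above can be checked with a small DFSM simulator. This is a sketch under stated assumptions: the state names, the dictionary encoding of the transition function, and the reading of "odd parity" as "an odd number of 1's" are choices made for this illustration.

```python
def dfsm_accepts(delta, start, accepting, w):
    """Run a DFSM given as a dict (state, symbol) -> state on the string w."""
    state = start
    for symbol in w:
        state = delta[(state, symbol)]
    return state in accepting

# Every a is immediately followed by b.
# q0: no pending a (accepting); q1: just read an a; dead: rule violated.
FOLLOWED = {
    ("q0", "a"): "q1", ("q0", "b"): "q0",
    ("q1", "a"): "dead", ("q1", "b"): "q0",
    ("dead", "a"): "dead", ("dead", "b"): "dead",
}

# Binary strings with an odd number of 1's.
PARITY = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}

print(dfsm_accepts(FOLLOWED, "q0", {"q0"}, "abab"))   # True
print(dfsm_accepts(FOLLOWED, "q0", {"q0"}, "aab"))    # False
print(dfsm_accepts(PARITY, "even", {"odd"}, "1011"))  # True
```

The loop does one table lookup per input symbol, so the running time is linear in |w|.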
The 1st choice, converting the NDFSM to a DFSM and then simulating the DFSM, can be very
inefficient in terms of both time and space. If M has k states, it could take time and space equal to
O(2^k) just to do the conversion, although the simulation after the conversion would take time equal
to O(|w|). So we would like to follow the 2nd choice, which directly simulates an NDFSM M without
converting it to a DFSM first. The idea is to simulate being in sets of states at once. But instead of
generating all of the reachable sets of states right away, as ndfsm-to-dfsm does, it generates them on
the fly as they are needed, being careful not to get stuck chasing ɛ-loops.
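The direct simulation can be sketched as follows. This is an illustrative Python version, not the textbook's own pseudocode: the function names, the dictionary encoding (state, symbol) -> set of states, and the use of the empty string "" as the label for ɛ-transitions are all assumptions.

```python
def eps_closure(states, delta):
    """Return all states reachable from `states` using only epsilon-moves."""
    closure = set(states)
    stack = list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, ""), ()):
            if r not in closure:  # the visited check stops epsilon-loops
                closure.add(r)
                stack.append(r)
    return closure

def ndfsm_accepts(delta, start, accepting, w):
    """Track the set of states the NDFSM could be in after each symbol."""
    current = eps_closure({start}, delta)
    for symbol in w:
        moved = set()
        for q in current:
            moved |= set(delta.get((q, symbol), ()))
        current = eps_closure(moved, delta)
        if not current:  # every computation path has died
            return False
    return bool(current & accepting)

# Strings over {a, b} that end in ab, with an epsilon-loop q0 -> q3 -> q0
# added on purpose to exercise the loop check.
DELTA = {
    ("q0", "a"): {"q0", "q1"}, ("q0", "b"): {"q0"},
    ("q1", "b"): {"q2"},
    ("q0", ""): {"q3"}, ("q3", ""): {"q0"},
}

print(ndfsm_accepts(DELTA, "q0", {"q2"}, "aab"))  # True
print(ndfsm_accepts(DELTA, "q0", {"q2"}, "ba"))   # False
```

Only the state sets actually reached while reading w are ever built, so no exponential table is constructed up front.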
An automaton that produces outputs based on current input and/or previous state is called a
transducer. Transducers can be of two types:
➢ Moore Machine: The output depends only on the current state.
➢ Mealy Machine: The output depends both on the current state and the current input.
[Figure: Moore machine]
[Figure: Mealy machine]
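The difference between the two types can be seen in a short sketch where both machines report the running parity of the 1's read so far. The encodings and function names are assumptions for illustration, as is the convention that the Moore machine also emits the output of its start state before reading any input.

```python
def run_moore(delta, output_of, start, w):
    """Moore machine: one output per state visited, including the start state."""
    state = start
    out = [output_of[state]]
    for symbol in w:
        state = delta[(state, symbol)]
        out.append(output_of[state])
    return "".join(out)

def run_mealy(delta, start, w):
    """Mealy machine: one output per transition taken."""
    state = start
    out = []
    for symbol in w:
        state, emitted = delta[(state, symbol)]
        out.append(emitted)
    return "".join(out)

MOORE_DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
MOORE_OUTPUT = {"even": "0", "odd": "1"}  # output attached to states

MEALY_DELTA = {  # output attached to (state, input) pairs
    ("even", "0"): ("even", "0"), ("even", "1"): ("odd", "1"),
    ("odd", "0"): ("odd", "1"), ("odd", "1"): ("even", "0"),
}

print(run_moore(MOORE_DELTA, MOORE_OUTPUT, "even", "1101"))  # 01001
print(run_mealy(MEALY_DELTA, "even", "1101"))                # 1001
```

Under this convention the Moore output is one symbol longer than the input, while the Mealy output has exactly the same length as the input.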
The bar code is composed of columns, each of the same width. A column can be either white or
black. If two black columns occur next to each other, it will look to us like a single wide black
column, but the reader will see two adjacent black columns of the standard width. The job of the
white columns is to delimit the black ones. A single black column encodes 0. A double black column
encodes 1.
We can build a finite state transducer to read such a bar code and output a string of binary digits.
We will represent a black bar with the symbol B and a white bar with the symbol W. The input to
the transducer will be a sequence of those symbols corresponding to reading the bar code left to
right. We'll assume that every correct bar code starts with a black column, so white space ahead of
the first black column is ignored. We will also assume that after every complete bar code there are
at least two white columns. So, the reader should, at that point, reset to be ready to read the next
code. If the reader sees three or more black columns in a row, it must indicate an error and stay in
its error state until it is reset by seeing two white columns.
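One possible reading of this description as a Mealy-style transducer is sketched below. The state names, the choice to emit each bit when the white column after it is read, and the symbol E used to indicate an error are assumptions of this sketch, not details taken from the notes.

```python
# (state, input column) -> (next state, output)
BARCODE = {
    ("start", "W"): ("start", ""),  # leading white space is ignored
    ("start", "B"): ("b1", ""),
    ("b1", "B"): ("b2", ""),
    ("b1", "W"): ("w1", "0"),       # single black column encodes 0
    ("b2", "B"): ("err", "E"),      # third black column in a row: error
    ("b2", "W"): ("w1", "1"),       # double black column encodes 1
    ("w1", "B"): ("b1", ""),
    ("w1", "W"): ("start", ""),     # two white columns: code complete, reset
    ("err", "B"): ("err", ""),
    ("err", "W"): ("errw1", ""),
    ("errw1", "B"): ("err", ""),
    ("errw1", "W"): ("start", ""),  # two white columns reset the error
}

def read_barcode(columns):
    """Translate a string of B/W columns into binary digits (E marks an error)."""
    state = "start"
    out = []
    for col in columns:
        state, emitted = BARCODE[(state, col)]
        out.append(emitted)
    return "".join(out)

print(read_barcode("WBWBBWBWW"))  # 010
print(read_barcode("BBBWWBWW"))   # E0
```

Because a bit is only emitted on the white column that follows it, this sketch relies on the stated assumption that every complete bar code is followed by white columns.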
Interpreters for finite state transducers can be built using techniques similar to the ones that we used
to build interpreters for finite state machines.
Bidirectional Transducers
A process that reads an input string and constructs a corresponding output string can be
described in a variety of different ways. Why should we choose the finite state transducer model?
One reason is that it provides a declarative, rather than a procedural, way to describe the
relationship between inputs and outputs. Such a declarative model can then be run in two directions.
For example, to read an English text requires transforming a word like "liberties" into the
root word "liberty" and the affix PLURAL. To generate an English text requires transforming a root
word like "liberty" and the semantic marker “PLURAL” into the surface word "liberties". If we
could specify, in a single declarative model, the relationship between surface words (the ones we
see in text) and underlying root words and affixes, we could use it for either application.
The facts about English spelling rules and morphological analysis can be described with a
bidirectional finite state transducer. If we expand the definition of a Mealy machine to allow non-
determinism, then any of these bidirectional processes can be represented. A nondeterministic
Mealy machine can be thought of as defining a relation between one set of strings (for example,
English surface words) and a second set of strings (for example, English underlying root words
along with affixes). It is possible that we will need a machine that is nondeterministic in one or both
directions because the relationship between the two sets may not be able to be described as a
function.
Example:
When we define a regular language, it doesn't matter what alphabet we use. Anything that is true of
a language L defined over the alphabet {a,b} will also be true of the language L' that contains
exactly the strings in L except that every a has been replaced by a 0 and every b has been replaced
by a 1. We can build a simple bidirectional transducer that can convert strings in L to strings in L'
and vice versa.
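A minimal sketch of such a transducer is given below, with a single state and one arc per symbol pair. The arc format (state, upper symbol, lower symbol, next state), the direction names, and the function name transduce are assumptions of this illustration; ɛ-arcs are not supported.

```python
# Each arc relates one upper symbol to one lower symbol.
ARCS = [
    ("q", "a", "0", "q"),
    ("q", "b", "1", "q"),
]

def transduce(arcs, start, accepting, s, direction="down"):
    """Return the set of strings that the transducer relates to s.

    direction="down" reads upper symbols and writes lower ones;
    direction="up" runs the same arcs the other way round.
    A set is returned because the relation need not be a function.
    """
    read, write = (1, 2) if direction == "down" else (2, 1)
    configs = {(start, "")}  # pairs of (current state, output so far)
    for symbol in s:
        configs = {
            (arc[3], out + arc[write])
            for (state, out) in configs
            for arc in arcs
            if arc[0] == state and arc[read] == symbol
        }
    return {out for (state, out) in configs if state in accepting}

print(transduce(ARCS, "q", {"q"}, "abba"))        # {'0110'}
print(transduce(ARCS, "q", {"q"}, "0110", "up"))  # {'abba'}
```

The same table of arcs serves both directions, which is the declarative property described above: the relation is stated once and only the roles of the two columns are swapped.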
Of course the real power of bidirectional finite state transducers comes from their ability to model
more complex processes.