Module 4

The document explains grammar in formal language theory, detailing its role in defining languages through sets of rules and various types of grammars, including regular, context-free, context-sensitive, and unrestricted grammars. It focuses on context-free grammar (CFG), its components, parsing techniques, and applications in programming languages and natural language processing. Additionally, it discusses feature-based grammar, which enhances CFG by incorporating grammatical features for more precise analysis.

Uploaded by

harditya.shah

1. What is Grammar in Formal Language Theory?

• A grammar is a finite set of rules that defines a language. It specifies how valid sequences of symbols (sentences) in the language can be constructed.
• Grammars are essential in both natural language processing and programming languages, because they determine the structure of valid sentences or expressions.
• Types of Grammars (the Chomsky hierarchy, from least to most expressive):
  o Regular Grammar: Generates regular languages, which can be recognized by finite automata.
  o Context-Free Grammar (CFG): Generates context-free languages, which can be recognized by pushdown automata and whose derivations can be drawn as parse trees.
  o Context-Sensitive Grammar: More powerful than a CFG; a rule may depend on the symbols surrounding the non-terminal being rewritten. These languages can be recognized by linear bounded automata.
  o Unrestricted Grammar: The most general form, equivalent to Turing machines in computational power.

2. Context-Free Grammar (CFG)

Definition: A CFG is a set of production rules that describe all possible strings in a context-free language. It is “context-free” because the left-hand side of every rule is a single non-terminal, so a rule can be applied regardless of the surrounding symbols.

Formal Components of a CFG:

o Non-terminal symbols (N): These represent syntactic categories (like <Expression> or <Term>) and can be expanded further.
o Terminal symbols (Σ): These are the actual characters or tokens of the language (e.g., a, b, +, *). They cannot be expanded.
o Start symbol (S): A designated non-terminal from which every derivation begins.
o Production rules (P): Rules that describe how a non-terminal can be replaced by a sequence of terminals and non-terminals.

CFG Notation: CFGs are often written in Backus-Naur Form (BNF) or a similar notation:

o <Expression> → <Expression> + <Term>
o <Expression> → <Term>
o <Term> → <Term> * <Factor>
o <Term> → <Factor>
o <Factor> → ( <Expression> )
o <Factor> → a | b | c
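
As a rough illustration of the four components (N, Σ, S, P), the rules above can be written down as plain data and checked for consistency. This is only a sketch: the `CFG` class, its field names, and the choice of `Expression` as the start symbol are assumptions made for the example, not something the notes specify.

```python
from dataclasses import dataclass


# Hypothetical container for illustration; the names are not from the notes.
@dataclass(frozen=True)
class CFG:
    nonterminals: frozenset
    terminals: frozenset
    start: str
    productions: tuple  # each entry is (lhs, rhs), where rhs is a tuple of symbols

    def validate(self) -> None:
        """Raise ValueError if the four components are inconsistent."""
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a non-terminal")
        if self.nonterminals & self.terminals:
            raise ValueError("N and Σ must be disjoint")
        symbols = self.nonterminals | self.terminals
        for lhs, rhs in self.productions:
            # Context-free: the left-hand side is exactly one non-terminal.
            if lhs not in self.nonterminals:
                raise ValueError(f"left-hand side {lhs!r} is not a non-terminal")
            for sym in rhs:
                if sym not in symbols:
                    raise ValueError(f"unknown symbol {sym!r} in rule for {lhs}")


# The <Expression>/<Term>/<Factor> rules, with "a | b | c" split into three rules.
# Assumption: Expression is taken as the start symbol.
EXPR_GRAMMAR = CFG(
    nonterminals=frozenset({"Expression", "Term", "Factor"}),
    terminals=frozenset({"a", "b", "c", "+", "*", "(", ")"}),
    start="Expression",
    productions=(
        ("Expression", ("Expression", "+", "Term")),
        ("Expression", ("Term",)),
        ("Term", ("Term", "*", "Factor")),
        ("Term", ("Factor",)),
        ("Factor", ("(", "Expression", ")")),
        ("Factor", ("a",)),
        ("Factor", ("b",)),
        ("Factor", ("c",)),
    ),
)

EXPR_GRAMMAR.validate()  # raises nothing: the grammar is well-formed
```

Storing each right-hand side as a tuple of symbols keeps multi-character non-terminals such as `Expression` unambiguous, which a single concatenated string would not.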

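Example CFG for Simple Arithmetic Expressions:

S → S + S
S → S * S
S → ( S )
S → a | b | c

This grammar generates arithmetic expressions built from +, *, parentheses, and the variables a, b, and c. Unlike the layered <Expression>/<Term>/<Factor> grammar above, it is ambiguous: a string such as a + b * c has more than one parse tree, because nothing in the rules makes * bind tighter than +.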
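
One way to see how permissive the single-symbol grammar is: count how many distinct parse trees it assigns to a string. The sketch below does this with a memoized recursion over substrings. The function name `count_parses` is hypothetical, and the input is assumed to contain no spaces.

```python
from functools import lru_cache


def count_parses(s: str) -> int:
    """Count parse trees for s under S -> S+S | S*S | (S) | a | b | c."""

    @lru_cache(maxsize=None)
    def count(i: int, j: int) -> int:
        # Number of ways S can derive the substring s[i:j].
        if j - i <= 0:
            return 0
        if j - i == 1:
            return 1 if s[i] in "abc" else 0  # S -> a | b | c
        total = 0
        if s[i] == "(" and s[j - 1] == ")":
            total += count(i + 1, j - 1)  # S -> ( S )
        for k in range(i + 1, j - 1):
            if s[k] in "+*":
                # S -> S + S or S -> S * S, with the operator at position k on top.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(s))


print(count_parses("a+b*c"))    # 2: either + or * can sit at the root
print(count_parses("(a+b)*c"))  # 1: the parentheses force a single structure
```

A count above 1 means the string is ambiguous under this grammar, and a count of 0 means the string is not in the language at all.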
3. Parsing with Context-Free Grammar (CFG)

• Parsing: Parsing is the process of analyzing a sequence of symbols to determine its grammatical structure.
• Goal: Check whether a string belongs to the language defined by the CFG and, if it does, produce a parse tree for it.
• Parsing Techniques:
  o Top-Down Parsing: Starts with the start symbol and applies production rules to match the input string. Common methods include recursive descent parsing and LL parsing.
  o Bottom-Up Parsing: Begins with the input string and applies production rules in reverse (reductions) until only the start symbol remains. LR parsing is a common bottom-up technique.
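
To make top-down parsing concrete, here is a minimal recursive descent sketch for the <Expression>/<Term>/<Factor> grammar from Section 2. Two assumptions are worth flagging: the left-recursive rules (such as <Expression> → <Expression> + <Term>) are rewritten as loops, since a naive recursive descent parser would recurse forever on them, and the tree is returned as nested tuples. The class and function names are hypothetical.

```python
class Parser:
    """Recursive descent parser: one method per non-terminal (illustrative only)."""

    def __init__(self, text: str):
        # Every non-space character is a token: a, b, c, +, *, (, )
        self.tokens = [ch for ch in text if not ch.isspace()]
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, tok: str) -> None:
        if self.peek() != tok:
            raise SyntaxError(f"expected {tok!r} at token {self.pos}, found {self.peek()!r}")
        self.pos += 1

    def expression(self):
        # <Expression> -> <Term> ( + <Term> )*   (left recursion turned into a loop)
        node = self.term()
        while self.peek() == "+":
            self.pos += 1
            node = ("+", node, self.term())
        return node

    def term(self):
        # <Term> -> <Factor> ( * <Factor> )*
        node = self.factor()
        while self.peek() == "*":
            self.pos += 1
            node = ("*", node, self.factor())
        return node

    def factor(self):
        # <Factor> -> ( <Expression> ) | a | b | c
        tok = self.peek()
        if tok == "(":
            self.pos += 1
            node = self.expression()
            self.expect(")")
            return node
        if tok in ("a", "b", "c"):
            self.pos += 1
            return tok
        raise SyntaxError(f"unexpected token {tok!r} at token {self.pos}")


def parse(text: str):
    parser = Parser(text)
    tree = parser.expression()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree


print(parse("a + b * c"))    # ('+', 'a', ('*', 'b', 'c'))
print(parse("(a + b) * c"))  # ('*', ('+', 'a', 'b'), 'c')
```

Because the grammar separates <Expression>, <Term>, and <Factor>, multiplication ends up deeper in the tree than addition without any extra precedence table.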

4. Detailed Example of Parsing with CFG

CFG for Arithmetic Expressions

S → S + S | S * S | ( S ) | a | b | c

• Start Symbol: S
• Non-terminals: {S}
• Terminals: {a, b, c, +, *, (, )}
• Production Rules:
  o S → S + S (for addition)
  o S → S * S (for multiplication)
  o S → ( S ) (for grouping)
  o S → a | b | c (for variables)
Example 1: Parsing (a + b) * c

Let's break down the derivation step by step; each step corresponds to one level of the parse tree:

1. Initialize with Start Symbol S: We begin with S, the starting point.
2. Apply Production Rule: Use S → S * S for the multiplication, giving S * S.
3. Expand First S: Use S → ( S ) to match the parentheses, giving ( S ) * S.
4. Expand Inside Parentheses: Apply S → S + S for the addition, giving ( S + S ) * S.
5. Match Terminals: Expand the two S symbols inside the parentheses to a and b, and the remaining outer S to c, giving ( a + b ) * c.
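
The steps above can be replayed mechanically as a leftmost derivation, where each step rewrites the first S in the current string. The helper below is a small sketch; `derive` and `RULES` are names invented for this example, and spaces are omitted from the strings.

```python
# Right-hand sides allowed for S in the grammar S -> S+S | S*S | (S) | a | b | c
RULES = {"S+S", "S*S", "(S)", "a", "b", "c"}


def derive(choices, start="S"):
    """Apply each chosen right-hand side to the leftmost S; return every form."""
    form = start
    forms = [form]
    for rhs in choices:
        if rhs not in RULES:
            raise ValueError(f"{rhs!r} is not a right-hand side of the grammar")
        if "S" not in form:
            raise ValueError("no non-terminal left to expand")
        form = form.replace("S", rhs, 1)  # rewrite only the leftmost S
        forms.append(form)
    return forms


steps = derive(["S*S", "(S)", "S+S", "a", "b", "c"])
print(" => ".join(steps))
# S => S*S => (S)*S => (S+S)*S => (a+S)*S => (a+b)*S => (a+b)*c
```

Each intermediate string is a sentential form; the derivation is finished once no S remains.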

Applications of CFG and Parsing

• Programming Languages: CFGs are fundamental in defining the syntax of programming languages (e.g., Java, Python).
• Natural Language Processing (NLP): CFGs help model the syntax of human languages.
• Compilers and Interpreters: CFGs drive syntactic analysis (parsing), which lets a compiler check whether code follows the language’s syntax rules. Lexical analysis, the earlier tokenizing stage, is usually specified with regular expressions instead.
Feature-Based Grammar Overview

• Feature-Based Grammar (FBG) enhances context-free grammars by associating features (e.g., number, tense, gender) with grammar rules to allow more precise syntactic and semantic analysis.
• This grammar type is well suited to languages with complex syntactic structures, where agreement (in number, gender, etc.) must be enforced between different elements.
• Example Use Case:
  o In English, a verb must agree in number and person with its subject, while in languages like French, adjectives must also agree in gender and number with nouns.

Grammatical Features

Grammatical Features are additional constraints that specify properties of syntactic elements. Each feature is defined as an attribute-value pair.

Key Grammatical Features:

o Number: Singular or plural (e.g., "cat" is singular, "cats" is plural).
o Person: Indicates whether the subject is first, second, or third person (e.g., "I" is first person, "you" is second person).
o Tense: Specifies the timing of the action (past, present, future).
o Gender: Masculine, feminine, or neuter, which is common in gendered languages (e.g., in Spanish, "niño" (boy) is masculine, "niña" (girl) is feminine).
o Case: Indicates the syntactic role of a noun (e.g., nominative for the subject, accusative for the object).

Example: The sentence "She walks" can be described with the following feature structures:

o “She”: [Gender: feminine, Number: singular, Person: third]
o “walks”: [Tense: present, Number: singular, Person: third]
o The agreement between "She" and "walks" is enforced by requiring their Number and Person features to match.

Detailed Example with Different Features:

o Sentence: "The children are playing."
  ▪ "children": [Number: plural]
  ▪ "are": [Number: plural, Tense: present]
  ▪ "playing": [Form: present participle, Aspect: continuous]
  ▪ Agreement is enforced between the subject "children" and the auxiliary "are", which must share the same Number value. The participle "playing" carries no Number feature, so it imposes no constraint.

Processing Feature Structures

• Feature Structures are a formal representation of grammatical features, typically written as sets of attribute-value pairs.
• Attribute-Value Pairs:
  o Attributes (like Tense, Number, Gender) represent grammatical categories.
  o Values (like present, singular, masculine) specify properties for those categories.
• Operations on Feature Structures:
  o Unification is the primary operation in feature-based grammar; it combines compatible feature structures.
    ▪ When two structures unify, they merge their attribute-value pairs; if any shared attribute has conflicting values, unification fails.

Example of Feature Structures and Unification:

• Feature Structure for "she": [Gender: feminine, Number: singular, Person: third]
• Feature Structure for "walks": [Tense: present, Number: singular, Person: third]

Unified Structure (exists only if every attribute the two structures share has the same value):

[Gender: feminine, Number: singular, Person: third, Tense: present]

If “walks” were plural instead of singular, unification would fail because the Number features would conflict.
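
For flat feature structures like the ones in this example, unification can be sketched with ordinary dictionaries. This is a simplified illustration under the assumption that every value is atomic; real feature-based grammars also allow nested structures, which this sketch does not handle. The function name `unify` and the use of `None` to signal failure are choices made for the example.

```python
def unify(fs1: dict, fs2: dict):
    """Merge two flat feature structures; return None if any shared attribute conflicts."""
    result = dict(fs1)  # copy so the inputs are left untouched
    for attr, value in fs2.items():
        if attr in result and result[attr] != value:
            return None  # conflicting values: unification fails
        result[attr] = value
    return result


she = {"Gender": "feminine", "Number": "singular", "Person": "third"}
walks = {"Tense": "present", "Number": "singular", "Person": "third"}
walk_plural = {"Tense": "present", "Number": "plural", "Person": "third"}

print(unify(she, walks))
# {'Gender': 'feminine', 'Number': 'singular', 'Person': 'third', 'Tense': 'present'}
print(unify(she, walk_plural))
# None
```

Attributes that appear in only one structure (Gender, Tense) are simply carried over; only attributes present in both are compared.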
