CD Overview

The document provides an overview of Compiler Design theory, covering Language Processing, Lexical Analysis, Semantic Analysis, and Intermediate Code Optimization. Key points include the roles of preprocessors, compilers, and interpreters in converting high-level code, the importance of semantic checks and symbol tables, and various optimization techniques for intermediate code. It emphasizes the structure and phases of a compiler, as well as methods for efficient memory management and code execution.

A detailed overview of Compiler Design theory topics (Units I, IV, and V), with 3–6 key points per topic for better understanding and memory retention.

✅ UNIT – I: Language Processing & Lexical Analysis


1. Overview of Language Processing

●​ Converts high-level source code into machine code.​

● Main language processors: Preprocessors, Compilers, Assemblers, Interpreters, Linkers & Loaders.

● Checks syntactic and semantic correctness during translation; some errors can only be detected at runtime.

2. Preprocessors

●​ Perform operations before compilation (e.g., macros, file inclusion).​

●​ Common in C/C++: #include, #define.​

●​ Helps modularize code and reduce redundancy.​
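To make the macro step concrete, here is a minimal sketch in Python of how object-like #define macros could be expanded before compilation. The function name expand_defines is hypothetical, and a real C preprocessor does far more (function-like macros, #include, conditionals).

```python
import re

def expand_defines(source):
    """Expand object-like #define macros; a toy stand-in for a real C preprocessor."""
    macros = {}
    output = []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if match:
            # Remember the macro and drop the directive from the output.
            macros[match.group(1)] = match.group(2).strip()
            continue
        for name, body in macros.items():
            # Replace whole-word occurrences only; the lambda avoids escape processing.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, body=body: body, line)
        output.append(line)
    return "\n".join(output)

expanded = expand_defines("#define MAX 100\nint a[MAX];")
```

After expansion, the compiler proper would see only `int a[100];`.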

3. Compiler

● Translates the entire source program into machine code before execution.

● Compiled programs generally execute faster than interpreted ones.

●​ Detects syntax & semantic errors.​

●​ Generates intermediate code, optimizes it, and produces target code.​


4. Assembler

●​ Converts assembly language into machine code.​

●​ Handles mnemonics and symbolic addresses.​

●​ Produces object code (.obj or .o files).​

5. Interpreter

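● Executes the program statement by statement, without producing a separate machine-code file.

● Interpreted programs generally run slower than compiled ones.

● Immediate feedback: execution stops at the first error encountered.

● Commonly used for Python, Ruby, and JavaScript.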

6. Linker & Loader

●​ Linker: Combines object files into a single executable, resolves external references.​

●​ Loader: Loads executable into memory for execution.​

● Together they are responsible for relocation and address binding.

7. Structure of a Compiler

●​ Front End: Lexical → Syntax → Semantic → Intermediate Code Generation.​

●​ Middle End: Code Optimization.​

●​ Back End: Code Generation → Machine Code.​

● Also includes Symbol Table, Error Handler, and Intermediate Representations (IR).

8. Phases of a Compiler

1.​ Lexical Analysis​

2.​ Syntax Analysis​

3.​ Semantic Analysis​

4.​ Intermediate Code Generation​

5.​ Optimization​

6.​ Code Generation​

7. Linking & Loading (performed after compilation by the linker and loader, not by the compiler itself)

9. Lexical Analysis

●​ Converts source code into tokens.​

●​ Removes whitespace/comments.​

●​ Detects lexical errors.​

●​ Tokens include keywords, identifiers, literals, etc.​

10. Lexical Analysis vs. Parsing


Feature          Lexical Analysis       Parsing (Syntax Analysis)
Unit processed   Characters → Tokens    Tokens → Parse Tree
Goal             Tokenization           Grammar rule validation
Output           Stream of tokens       Syntax tree


11. Token, Pattern, Lexeme

●​ Token: Category/type (e.g., IDENTIFIER, NUMBER).​

●​ Pattern: Rule to identify a token (e.g., regex).​

●​ Lexeme: Actual string in code (e.g., x, 123, main).​

12. Lexical Errors

●​ Caused by invalid symbols or illegal characters.​

●​ Example: int @value = 5; → Error: @ is not valid.​
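The ideas in sections 9, 11, and 12 can be sketched together as a small regex-driven scanner. This is a minimal illustration in Python, not a production lexer: the token categories, the KEYWORDS set, and the name tokenize are all assumptions made for the example.

```python
import re

# Hypothetical token categories for a tiny C-like language (illustrative only).
TOKEN_SPEC = [
    ("COMMENT",    r"/\*.*?\*/|//[^\n]*"),
    ("NUMBER",     r"\d+"),
    ("IDENTIFIER", r"[A-Za-z_]\w*"),
    ("OPERATOR",   r"[+\-*/=<>]"),
    ("PUNCT",      r"[;(){},]"),
    ("SKIP",       r"\s+"),
    ("MISMATCH",   r"."),          # anything else is a lexical error
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC), re.DOTALL
)
KEYWORDS = {"int", "if", "while", "return"}

def tokenize(source):
    """Return (token, lexeme) pairs, dropping whitespace and comments."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind in ("SKIP", "COMMENT"):
            continue
        if kind == "MISMATCH":
            raise SyntaxError(f"lexical error: unexpected {lexeme!r} at offset {match.start()}")
        if kind == "IDENTIFIER" and lexeme in KEYWORDS:
            kind = "KEYWORD"       # reserved words share the identifier pattern
        tokens.append((kind, lexeme))
    return tokens

tokens = tokenize("int x = 42; // init")
```

Each pair holds the token (category) and the lexeme (matched text); the regex in TOKEN_SPEC is the pattern. Calling tokenize("int @value = 5;") raises a SyntaxError because @ matches no pattern.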

13. Regular Expressions

●​ Define patterns for tokens.​

●​ Operators: * (0 or more), + (1 or more), | (or), () for grouping.​

●​ Example: Identifier regex → letter (letter | digit)*​

14. Regular Definitions for Language Constructs

●​ Identifiers: letter (letter | digit)*​

●​ Numbers: digit+​

●​ Comments: /* ... */ or // ...​

●​ Used to generate transition diagrams for token recognition.​
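Regular definitions name small patterns and build larger ones from them. The sketch below writes the definitions above with Python's re module; treating the underscore as a letter and the helper name classify are assumptions of the example.

```python
import re

# Regular definitions written as named building blocks (names are illustrative).
LETTER = r"[A-Za-z_]"
DIGIT = r"[0-9]"
IDENTIFIER = re.compile(rf"{LETTER}({LETTER}|{DIGIT})*")   # letter (letter | digit)*
NUMBER = re.compile(rf"{DIGIT}+")                          # digit+
COMMENT = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)     # /* ... */ or // ...

def classify(lexeme):
    """Return the name of the first definition that matches the whole lexeme."""
    definitions = (("identifier", IDENTIFIER), ("number", NUMBER), ("comment", COMMENT))
    for name, pattern in definitions:
        if pattern.fullmatch(lexeme):
            return name
    return None
```

For instance, classify("sum1") yields "identifier", while classify("1abc") yields None because no definition matches the whole string.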


15. Transition Diagrams

●​ Graphical representation of how tokens are recognized.​

● States are connected by transitions labeled with input characters.

●​ Used to implement Finite Automata for lexical analyzers.​
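A transition diagram maps directly onto a table-driven finite automaton. The following minimal sketch encodes the diagram for letter (letter | digit)*; the state names "start" and "in_id" are hypothetical labels chosen for the example.

```python
# Transition diagram for identifiers, letter (letter | digit)*, as a table.
# "start" is the initial state and "in_id" is the only accepting state.
TRANSITIONS = {
    ("start", "letter"): "in_id",
    ("in_id", "letter"): "in_id",
    ("in_id", "digit"): "in_id",
}
ACCEPTING = {"in_id"}

def char_class(ch):
    """Map a character to the input class used on the diagram's edges."""
    if ch.isalpha() or ch == "_":
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

def accepts(text):
    """Run the automaton over text; accept only if it ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:          # no edge for this input: reject
            return False
    return state in ACCEPTING
```

Here accepts("x1") is True, while accepts("1x") is False because the start state has no edge labeled digit.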

16. Reserved Words vs Identifiers

●​ Reserved words: Fixed by language (e.g., if, while, return).​

●​ Identifiers: User-defined names (e.g., sum, main).​

✅ UNIT – IV: Semantic Analysis & Runtime Environment


1. Semantic Analysis

● Checks that a syntactically valid program is also meaningful under the language's rules.

●​ Checks for type errors, undeclared variables, etc.​

●​ Uses syntax-directed translation (SDT) and attribute evaluation.​

2. Syntax Directed Translation (SDT)

●​ Attaches semantic rules to grammar productions.​

●​ Builds annotated syntax trees.​

●​ Example: E → E1 + E2 { E.val = E1.val + E2.val }​
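As a rough illustration of semantic rules attached to productions, the sketch below parses sums of integers and computes val as each production is applied. The grammar in the docstring (with T → number) and the function name evaluate are assumptions for this example, and the left recursion is implemented as a loop.

```python
import re

def evaluate(expression):
    """Parse digit-only sums such as '2 + 3 + 4' and return E.val.

    Grammar sketch (an assumption for this example):
        E -> E1 + T   { E.val = E1.val + T.val }
        E -> T        { E.val = T.val }
        T -> number   { T.val = int(number.lexeme) }
    """
    tokens = re.findall(r"\d+|\+", expression)
    if "".join(tokens) != re.sub(r"\s+", "", expression):
        raise SyntaxError("unexpected character in expression")
    position = 0

    def parse_term():
        nonlocal position
        if position >= len(tokens) or not tokens[position].isdigit():
            raise SyntaxError("expected a number")
        value = int(tokens[position])          # T.val = int(number.lexeme)
        position += 1
        return value

    value = parse_term()                       # E.val = T.val
    while position < len(tokens):
        if tokens[position] != "+":
            raise SyntaxError("expected '+'")
        position += 1
        value = value + parse_term()           # E.val = E1.val + T.val
    return value
```

With this sketch, evaluate("2 + 3 + 4") returns 9.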


3. Evaluation of Semantic Rules

●​ Synthesized attributes: Computed from children nodes.​

● Inherited attributes: Passed down from the parent node or across from sibling nodes.

● Evaluated over a parse tree or abstract syntax tree.
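The two attribute kinds can be contrasted on small hand-built trees. In this sketch, the Node class and both function names are hypothetical; val is synthesized bottom-up, while the declared type is inherited top-down, as in a declaration like int a, b.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    attrs: dict = field(default_factory=dict)

def synthesize_val(node):
    """Synthesized attribute: a node's val is computed from its children's vals."""
    if not node.children:
        node.attrs["val"] = int(node.label)
    else:
        node.attrs["val"] = sum(synthesize_val(child) for child in node.children)
    return node.attrs["val"]

def inherit_type(node, declared_type):
    """Inherited attribute: the declared type flows from the parent down to each name."""
    node.attrs["type"] = declared_type
    for child in node.children:
        inherit_type(child, declared_type)

# E -> E1 + E2 with leaves 2 and 3
total = Node("+", [Node("2"), Node("3")])
synthesize_val(total)

# Declaration 'int a, b': the identifier list inherits 'int' from the declaration
id_list = Node("id_list", [Node("a"), Node("b")])
inherit_type(id_list, "int")
```

After both passes, total.attrs["val"] is 5 and every identifier under id_list carries the type "int".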

4. Symbol Tables

●​ Stores identifiers with attributes (type, scope, address).​

●​ Supports insertion, lookup, and scope management.​

●​ Essential for semantic analysis and code generation.​
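One common way to support insertion, lookup, and scope management is a stack of dictionaries. The class below is a minimal sketch under that assumption; its method names and stored fields are illustrative, not a fixed interface.

```python
class SymbolTable:
    """A stack of dictionaries, one per open scope (innermost scope last)."""

    def __init__(self):
        self.scopes = [{}]                     # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        if len(self.scopes) == 1:
            raise RuntimeError("cannot exit the global scope")
        self.scopes.pop()

    def insert(self, name, type_, address):
        scope = self.scopes[-1]
        if name in scope:
            raise KeyError(f"redeclaration of {name!r} in the same scope")
        scope[name] = {"type": type_, "address": address, "level": len(self.scopes) - 1}

    def lookup(self, name):
        """Search from the innermost scope outward; return None if undeclared."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None

table = SymbolTable()
table.insert("x", "int", 0)
table.enter_scope()
table.insert("x", "float", 4)                  # shadows the outer x
inner_type = table.lookup("x")["type"]
table.exit_scope()
outer_type = table.lookup("x")["type"]
```

Inside the nested scope the lookup finds the float declaration; after exit_scope it finds the int declaration again.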

5. Storage Organization

●​ Code, static data, stack, heap.​

●​ Stack: For function calls and local variables.​

●​ Heap: For dynamic memory allocation.​

6. Access to Non-local Data

● Uses access links (also called static links) or displays.

●​ Helps access variables from outer scopes or parent functions.​
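The access-link idea can be simulated with linked activation records. This is a simplified sketch: the Frame class and lookup_nonlocal are hypothetical, and a real compiler would know the number of hops at compile time rather than searching at runtime.

```python
class Frame:
    """A simplified activation record; field names are illustrative."""

    def __init__(self, name, locals_, access_link=None):
        self.name = name
        self.locals = locals_
        self.access_link = access_link         # frame of the lexically enclosing procedure

def lookup_nonlocal(frame, variable):
    """Follow access links outward until the variable is found."""
    hops = 0
    while frame is not None:
        if variable in frame.locals:
            return frame.locals[variable], hops
        frame = frame.access_link
        hops += 1
    raise NameError(f"{variable!r} is not visible from this scope")

outer = Frame("outer", {"count": 10})
middle = Frame("middle", {"step": 2}, access_link=outer)
inner = Frame("inner", {"i": 0}, access_link=middle)
value, hops = lookup_nonlocal(inner, "count")
```

From inner, the variable count is reached by following two access links.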

7. Heap Management

● Allocation: malloc() in C, new in C++.

● Deallocation: free() in C, delete in C++.

● Some languages (e.g., Java, Python) reclaim unused memory automatically through garbage collection.
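To show what allocation and deallocation involve, the sketch below models a heap as a list of free blocks. First-fit placement and coalescing on free are assumptions chosen for brevity (one of several possible strategies), and the class FirstFitHeap is hypothetical.

```python
class FirstFitHeap:
    """A toy heap: a list of free (start, size) blocks, allocated first-fit."""

    def __init__(self, size):
        self.free_blocks = [(0, size)]
        self.allocated = {}                    # start address -> size

    def malloc(self, size):
        for index, (start, block_size) in enumerate(self.free_blocks):
            if block_size >= size:
                remainder = block_size - size
                if remainder:
                    self.free_blocks[index] = (start + size, remainder)
                else:
                    del self.free_blocks[index]
                self.allocated[start] = size
                return start
        return None                            # no block is large enough

    def free(self, address):
        size = self.allocated.pop(address)
        self.free_blocks.append((address, size))
        self.free_blocks.sort()
        # Coalesce adjacent free blocks to limit fragmentation.
        merged = [self.free_blocks[0]]
        for start, block_size in self.free_blocks[1:]:
            last_start, last_size = merged[-1]
            if last_start + last_size == start:
                merged[-1] = (last_start, last_size + block_size)
            else:
                merged.append((start, block_size))
        self.free_blocks = merged

heap = FirstFitHeap(16)
first = heap.malloc(4)
second = heap.malloc(8)
too_big = heap.malloc(8)                       # only 4 units remain
heap.free(first)
heap.free(second)
```

Once both blocks are freed, coalescing restores a single free block covering the whole heap.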

8. Parameter Passing Mechanisms

●​ Call by value: Pass copy of value.​

●​ Call by reference: Pass address.​

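● Call by name: The argument expression is passed unevaluated and re-evaluated at each use (rare; e.g., ALGOL 60).

● The choice affects memory use and whether the caller sees changes made by the callee.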
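The three mechanisms can be imitated in Python, with the caveat that Python itself always passes object references. The sketch therefore simulates call by value with an explicit copy and call by name with a zero-argument function (a thunk); all function names are illustrative.

```python
import copy

def by_value(values):
    values = copy.copy(values)     # callee works on its own copy
    values[0] = 99

def by_reference(values):
    values[0] = 99                 # callee writes through the shared object

def by_name(thunk):
    # The argument expression is re-evaluated each time the parameter is used.
    return thunk() + thunk()

counter = {"n": 0}

def next_value():
    counter["n"] += 1
    return counter["n"]

data_a = [1, 2, 3]
by_value(data_a)                   # caller's list is unchanged
data_b = [1, 2, 3]
by_reference(data_b)               # caller's list is modified
name_result = by_name(lambda: next_value())
```

Because the thunk runs twice, next_value() is evaluated twice and name_result is 1 + 2 = 3; under call by value the expression would have been evaluated once.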

✅ UNIT – V: Intermediate Code & Optimization


1. Intermediate Code

●​ Abstract form between source and machine code.​

●​ Easy to optimize and portable.​

●​ Example: Three-address code (TAC).​

2. Three Address Code (TAC)

●​ Format: x = y op z​

●​ Uses temporary variables: t1 = a + b;​

●​ Simple and flexible for optimization.​
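Generating TAC from an expression is a post-order walk that introduces one temporary per operator. The sketch below borrows Python's ast module as a stand-in parser; the function name to_tac and the temporary naming scheme t1, t2, ... are assumptions.

```python
import ast

OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_tac(expression):
    """Translate an arithmetic expression into a list of three-address instructions."""
    code = []

    def visit(node):
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            left = visit(node.left)            # operands first (post-order)
            right = visit(node.right)
            temp = f"t{len(code) + 1}"         # fresh temporary for this result
            code.append(f"{temp} = {left} {OPERATORS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    visit(ast.parse(expression, mode="eval").body)
    return code

tac = to_tac("a + b * c")
```

For a + b * c this produces t1 = b * c followed by t2 = a + t1, so each instruction has at most one operator.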


3. Quadruples & Triples

● Quadruples: (op, arg1, arg2, result), e.g., (+, a, b, t1)

● Triples: (op, arg1, arg2), result is implicit by index.
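The relationship between the two forms can be sketched as a conversion: drop the result field and refer to earlier results by position. The helper to_triples is hypothetical and assumes each temporary is assigned exactly once.

```python
# Quadruples for: t1 = a + b; t2 = t1 * c
quadruples = [
    ("+", "a", "b", "t1"),
    ("*", "t1", "c", "t2"),
]

def to_triples(quads):
    """Drop the result field; refer to earlier results by instruction index."""
    index_of = {}                  # temporary name -> index of the triple computing it
    triples = []
    for op, arg1, arg2, result in quads:
        arg1 = index_of.get(arg1, arg1)
        arg2 = index_of.get(arg2, arg2)
        index_of[result] = len(triples)
        triples.append((op, arg1, arg2))
    return triples

triples = to_triples(quadruples)
```

The second triple becomes (*, 0, c), where 0 refers to the result of triple number 0.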

4. Abstract Syntax Tree (AST)

●​ Tree structure that represents the program's syntax.​

●​ Compact, omits unnecessary details like parentheses or commas.​
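A quick way to see that an AST drops parentheses is to inspect one. This sketch uses Python's own ast module as a stand-in for a compiler's AST; the helper shape is hypothetical and renders each node as a nested tuple.

```python
import ast

def shape(node):
    """Render an expression AST as nested tuples: (operator, left, right)."""
    if isinstance(node, ast.BinOp):
        return (type(node.op).__name__, shape(node.left), shape(node.right))
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError("unsupported node")

with_parens = shape(ast.parse("(a + b) * c", mode="eval").body)
without_parens = shape(ast.parse("a + b * c", mode="eval").body)
```

Neither tuple contains a parenthesis node: grouping is expressed purely by which operator sits higher in the tree.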

5. Basic Blocks & Control Flow Graph (CFG)

● Basic Block: Sequence of instructions with a single entry and a single exit; control enters only at the first instruction and leaves only at the last.

●​ CFG: Nodes = basic blocks, Edges = control flow (jumps, branches).​

●​ Used in flow analysis and optimizations.​
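Basic blocks are usually found by marking "leaders": the first instruction, every jump target, and every instruction that follows a jump. The sketch below applies those rules and then derives CFG edges. Encoding instructions as (text, target index) pairs and treating only text starting with "goto" as an unconditional jump are assumptions of the example.

```python
def split_basic_blocks(instructions):
    """Partition instruction indices into basic blocks using the leader rules."""
    if not instructions:
        return []
    leaders = {0}                                   # rule 1: first instruction
    for index, (_, target) in enumerate(instructions):
        if target is not None:
            leaders.add(target)                     # rule 2: target of a jump
            if index + 1 < len(instructions):
                leaders.add(index + 1)              # rule 3: instruction after a jump
    starts = sorted(leaders)
    ends = starts[1:] + [len(instructions)]
    return [list(range(start, end)) for start, end in zip(starts, ends)]

def build_cfg(instructions, blocks):
    """Return edges between block numbers: jump targets plus fall-through."""
    block_of = {index: number for number, block in enumerate(blocks) for index in block}
    edges = set()
    for number, block in enumerate(blocks):
        text, target = instructions[block[-1]]
        if target is not None:
            edges.add((number, block_of[target]))
        if not text.startswith("goto") and number + 1 < len(blocks):
            edges.add((number, number + 1))         # conditional jump or fall-through
    return sorted(edges)

program = [
    ("i = 0", None),             # 0
    ("if i >= n goto 5", 5),     # 1
    ("s = s + i", None),         # 2
    ("i = i + 1", None),         # 3
    ("goto 1", 1),               # 4
    ("return s", None),          # 5
]
blocks = split_basic_blocks(program)
edges = build_cfg(program, blocks)
```

The loop shows up in the graph as the back edge from the block holding instructions 2 to 4 to the block holding the loop test.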

6. Machine Independent Code Optimization

●​ Common Subexpression Elimination: Avoid recomputing expressions.​

●​ Constant Folding: Compute constants at compile time.​

● Copy Propagation: After a copy x = y, replace later uses of x with y.

● Dead Code Elimination: Remove code whose results are never used or that can never be reached.

●​ Strength Reduction: Replace expensive ops with cheaper ones.​


●​ Loop Optimization: Unrolling, invariant code motion.​

●​ Procedure Inlining: Replace call with function body.​
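Two of these techniques, constant folding and common subexpression elimination, can be sketched on a single basic block of quadruples. The function optimize_block is hypothetical; it assumes every result name is assigned once and that folded or eliminated temporaries are not needed outside the block.

```python
import operator

FOLD = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def optimize_block(quads):
    """Apply constant folding and CSE to one basic block of (op, arg1, arg2, result)."""
    constants = {}    # name -> known constant value
    available = {}    # (op, arg1, arg2) -> name that already holds the value
    replaced = {}     # eliminated name -> name to use instead
    optimized = []
    for op, arg1, arg2, result in quads:
        arg1 = replaced.get(arg1, arg1)
        arg2 = replaced.get(arg2, arg2)
        arg1 = constants.get(arg1, arg1)
        arg2 = constants.get(arg2, arg2)
        if isinstance(arg1, int) and isinstance(arg2, int) and op in FOLD:
            constants[result] = FOLD[op](arg1, arg2)    # constant folding
            continue
        key = (op, arg1, arg2)
        if key in available:
            replaced[result] = available[key]           # common subexpression
            continue
        available[key] = result
        optimized.append((op, arg1, arg2, result))
    return optimized

block = [
    ("*", 4, 2, "t1"),        # folded to the constant 8
    ("+", "a", "b", "t2"),
    ("+", "a", "b", "t3"),    # same expression as t2
    ("*", "t3", "t1", "t4"),
]
optimized = optimize_block(block)
```

Four instructions shrink to two: t1 becomes the constant 8, and t3 is replaced by t2.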

7. Machine Dependent Code Optimization

●​ Peephole Optimization: Small, local code improvements.​

●​ Register Allocation: Assign variables to CPU registers.​

●​ Instruction Scheduling: Rearranging instructions for pipeline efficiency.​

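● Inter-procedural Optimization: Works across function boundaries (usually classified as machine-independent).

● Garbage Collection: Frees unused memory, e.g., by reference counting (a runtime memory-management service rather than a code optimization).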
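Peephole optimization can be sketched as a pass that inspects one or two adjacent instructions at a time. The mnemonics below (MOV, ADD, MUL, SHL, RET) and the "OP dst, src" text format are hypothetical, not a real instruction set, and the rewrites ignore side effects such as condition flags.

```python
def peephole(instructions):
    """Apply a few local rewrites to a list of pseudo-assembly strings."""
    result = []
    for instruction in instructions:
        opcode, _, operands = instruction.partition(" ")
        args = [arg.strip() for arg in operands.split(",")]
        if opcode == "MOV" and args[0] == args[1]:
            continue                                   # MOV r, r does nothing
        if opcode == "ADD" and args[1] == "0":
            continue                                   # adding zero does nothing
        if opcode == "MUL" and args[1] == "2":
            instruction = f"SHL {args[0]}, 1"          # cheaper shift replaces multiply
        if result and opcode == "MOV" and result[-1] == f"MOV {args[1]}, {args[0]}":
            continue                                   # value was just copied the other way
        result.append(instruction)
    return result

cleaned = peephole([
    "MOV r1, r1",
    "ADD r2, 0",
    "MUL r3, 2",
    "MOV a, r1",
    "MOV r1, a",
    "RET",
])
```

Six instructions reduce to three: the shift, the first of the two mirrored moves, and the return.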
