Compiler Design (MCA554) – Unit 3: Syntax Directed Translation and Intermediate Code Generation
Compiler Design (MCA554) – Unit 3: Syntax Directed Translation and Intermediate Code Generation
Based on MCA Semester III Syllabus
---
Introduction to Syntax Directed Translation
Syntax Directed Translation (SDT) is a method in which semantic actions are associated with grammar rules.
It helps the compiler:
Generate intermediate code
Perform type checking
Build syntax trees
Translate source code into machine-independent representation
---
Need for Syntax Directed Translation
A parser only checks syntax.
The compiler must also:
Understand meaning
Generate code
Perform semantic analysis
Syntax Directed Translation helps achieve these tasks.
---
Syntax Directed Definition (SDD)
A Syntax Directed Definition is a Context Free Grammar (CFG) with attributes and semantic rules attached to productions.
General Form:
A → B C
Semantic Rule:
A.val = B.val + C.val
---
Attributes
Attributes store information about grammar symbols.
Types:
Information flows upward in the parse tree.
Example:
E.val = E1.val + T.val
---
Information flows downward or sideways.
Example:
T.type = E.type
---
Syntax Tree
A Syntax Tree is a compact representation of a program structure.
Unlike Parse Trees:
Keywords are omitted
Grammar symbols are reduced
Example Expression:
a + b * c
Syntax Tree:
+
/ \
a *
/ \
b c
---
Advantages of Syntax Trees
Compact representation
Easy code generation
Used in optimization
---
S-Attributed Definition
Uses only Synthesized Attributes.
Characteristics:
Easy implementation
Suitable for Bottom-Up Parsing
Example:
E → E1 + T
E.val = E1.val + T.val
---
L-Attributed Definition
Uses:
Synthesized Attributes
Limited Inherited Attributes
Characteristics:
Suitable for Top-Down Parsing
Widely used in compilers
---
Translation Schemes
A Translation Scheme is a CFG with semantic actions embedded within productions.
Example:
E → E + T { print('+') }
Semantic actions execute during parsing.
---
Intermediate Code Generation
Intermediate Code is generated between source code and machine code.
Compiler Structure:
Source Code
↓
Intermediate Code
↓
Machine Code
Benefits:
Machine independent
Easier optimization
Simplifies compiler design
---
Intermediate Forms of Source Programs
Common intermediate representations:
1. Syntax Tree
Tree representation.
Operator appears before operands.
Example:
+ A B
---
Operator appears after operands.
Example:
A B +
---
4. Three Address Code
Most important intermediate representation.
---
Three Address Code (TAC)
Each statement contains at most three addresses.
General Form:
x = y op z
Example:
a = b + c * d
Three Address Code:
t1 = c * d
t2 = b + t1
a = t2
---
Advantages of TAC
Easy optimization
Easy machine code generation
Simple representation
---
Types of Three Address Statements
Assignment
a = b
---
Unary Operation
t1 = -a
---
Binary Operation
t2 = a + b
---
Conditional Jump
if a < b goto L1
---
Unconditional Jump
goto L2
---
Representation of TAC
Three common representations:
---
Structure:
(op, arg1, arg2, result)
Example:
(+, a, b, t1)
---
Structure:
(op, arg1, arg2)
Example:
0: (+, a, b)
---
Indirect Triples
Uses pointers to triples.
Advantage:
Easier code optimization
---
Translation of Simple Statements
Expression:
x = a + b
TAC:
t1 = a + b
x = t1
---
Boolean expressions evaluate to:
TRUE
FALSE
Examples:
a > b
x == y
age >= 18
---
Intermediate Code for Boolean Expressions
Example:
if(a > b)
TAC:
if a > b goto L1
goto L2
---
Flow control determines program execution order.
Examples:
if
if-else
while
do-while
for
---
Translation of IF Statement
Source Code:
if(a>b)
x=y;
Three Address Code:
if a>b goto L1
goto L2
L1:
x=y
L2:
---
Translation of IF-ELSE Statement
Source Code:
if(a>b)
x=y;
else
x=z;
Three Address Code:
if a>b goto L1
goto L2
L1:
x=y
goto L3
L2:
x=z
L3:
---
Translation of WHILE Loop
Source Code:
while(a<b)
{
a=a+1;
}
Three Address Code:
L1:
if a<b goto L2
goto L3
L2:
a=a+1
goto L1
L3:
---
Translation of FOR Loop
Source Code:
for(i=1;i<=10;i++)
Intermediate representation:
i=1
L1:
if i<=10 goto L2
goto L3
L2:
statement
i=i+1
goto L1
L3:
---
Applications of Intermediate Code
Compiler construction
Code optimization
Program analysis
Machine code generation
---
Important Exam Questions
Short Questions
1. Define Syntax Directed Translation.
2. What is SDD?
3. What is a Syntax Tree?
4. Define TAC.
5. What are Synthesized Attributes?
6. What is Polish Notation?
7. Define Translation Scheme.
8. What is Quadruple Representation?
---
Long Questions
1. Explain Syntax Directed Definitions with examples.
2. Differentiate S-Attributed and L-Attributed Definitions.
3. Explain Syntax Trees and their applications.
4. Explain Three Address Code and its representations.
5. Generate TAC for arithmetic expressions.
6. Explain translation of Boolean expressions and control statements.
---
Quick Revision Notes
SDT = Syntax + Semantic Actions.
SDD = CFG + Attributes + Rules.
Syntax Tree = Compact program structure.
S-Attributed = Only Synthesized Attributes.
L-Attributed = Synthesized + Inherited Attributes.
TAC = Three Address Code.
Quadruple = (op, arg1, arg2, result).
Triple = (op, arg1, arg2).
Boolean Expressions = TRUE/FALSE.
TAC is the most important intermediate representation for exams.
---
Comments
Post a Comment