Compiler Design (MCA554) Unit 1: Introduction to Compilers & Lexical Analysis
From your MCA Semester III syllabus.
---
What is a Compiler?
A compiler is a program that translates a program written in a high-level language into a lower-level language, usually machine language.
Example:
printf("Hello");
The compiler converts this statement into machine code that the computer can execute.
---
Language Translators
1. Compiler
Translates the entire source program before any of it runs.
Source Program
↓
Compiler
↓
Object Code
Advantages:
Faster execution of the translated program
Errors are reported at compile time, before the program runs
Examples:
C
C++
Go
---
2. Interpreter
Translates and executes the program one statement at a time.
Source Program
↓
Interpreter
↓
Execute Line by Line
Advantages:
Easier debugging, because execution stops at the line that fails
Disadvantages:
Slower execution
Examples (in their usual implementations):
Python
JavaScript
---
3. Assembler
Converts Assembly Language to Machine Language.
Assembly Code
↓
Assembler
↓
Machine Code
---
Difference Between Compiler and Interpreter
Compiler                 | Interpreter
Entire program compiled  | Line by line
Faster execution         | Slower execution
Generates object code    | No object code
Errors after compilation | Errors immediately
---
Structure of a Compiler
A compiler works in a sequence of phases. Each phase takes the output of the previous phase as its input.
Source Program
↓
Lexical Analysis
↓
Syntax Analysis
↓
Semantic Analysis
↓
Intermediate Code Generation
↓
Code Optimization
↓
Code Generation
↓
Target Program
---
Phases of Compiler
1. Lexical Analysis
First phase.
Tasks:
Read source code
Remove spaces/comments
Generate tokens
Example:
int a=10;
Tokens:
int
a
=
10
;
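The three tasks above can be sketched as a small hand-written scanner. This is only an illustration: the function name `scan` is made up for these notes, and it handles just identifiers, integer constants, single-character symbols and `/* ... */` comments.

```python
def scan(source):
    """Split source text into lexemes, skipping whitespace and /* ... */ comments."""
    tokens = []
    i = 0
    n = len(source)
    while i < n:
        ch = source[i]
        if ch.isspace():
            i += 1                                  # ignore spaces
        elif source.startswith("/*", i):
            end = source.find("*/", i + 2)
            # In this sketch an unterminated comment swallows the rest of the input.
            i = n if end == -1 else end + 2
        elif ch.isalpha() or ch == "_":
            start = i                               # keyword or identifier
            while i < n and (source[i].isalnum() or source[i] == "_"):
                i += 1
            tokens.append(source[start:i])
        elif ch.isdigit():
            start = i                               # integer constant
            while i < n and source[i].isdigit():
                i += 1
            tokens.append(source[start:i])
        else:
            tokens.append(ch)                       # operator or delimiter
            i += 1
    return tokens


print(scan("int a=10; /* set a */"))
# ['int', 'a', '=', '10', ';']
```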
---
2. Syntax Analysis
Checks that the sequence of tokens follows the grammar of the language.
Uses:
Parse Trees
Context Free Grammar
Example:
a = 5 + ;
Syntax error: the + operator has no right operand.
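A minimal sketch of a syntax check, assuming a toy grammar invented for these notes: assignment → identifier = operand { (+ | -) operand }, with an optional final semicolon. The toy grammar has no unary operators, so it rejects any statement in which an operator is missing an operand. A real parser would also build a parse tree; this one only answers yes or no.

```python
import re

IDENT = re.compile(r"[A-Za-z_]\w*")
OPERAND = re.compile(r"[A-Za-z_]\w*|\d+")


def is_valid_assignment(text):
    """Return True if text fits: IDENT '=' operand (('+' | '-') operand)* [';']"""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", text)
    if tokens and tokens[-1] == ";":
        tokens.pop()                      # the trailing semicolon is optional here
    # The statement must start with: identifier '='
    if len(tokens) < 3 or not IDENT.fullmatch(tokens[0]) or tokens[1] != "=":
        return False
    # First operand of the expression.
    if not OPERAND.fullmatch(tokens[2]):
        return False
    pos = 3
    # Zero or more (operator, operand) pairs.
    while pos < len(tokens):
        if tokens[pos] not in ("+", "-"):
            return False
        if pos + 1 >= len(tokens) or not OPERAND.fullmatch(tokens[pos + 1]):
            return False
        pos += 2
    return True


print(is_valid_assignment("a = b + 5;"))   # True
print(is_valid_assignment("a = 5 + ;"))    # False
```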
---
3. Semantic Analysis
Checks that the program is meaningful: for example, that variables are declared and that types are compatible.
Example:
int x;
x = "Hello";
Semantic error: a string cannot be assigned to an int variable (type mismatch).
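A type check like this one can be sketched with a symbol table that maps each declared name to its type. The helper names `literal_type` and `check_assignment`, and the type names used as strings, are assumptions made for this illustration.

```python
def literal_type(lexeme):
    """Guess the type of a literal from its spelling (only int and string here)."""
    if len(lexeme) >= 2 and lexeme.startswith('"') and lexeme.endswith('"'):
        return "string"
    if lexeme.isdigit():
        return "int"
    return "unknown"


def check_assignment(symbol_table, name, literal):
    """Return an error message, or None if the assignment is type-correct."""
    if name not in symbol_table:
        return f"semantic error: '{name}' is not declared"
    declared = symbol_table[name]
    actual = literal_type(literal)
    if declared != actual:
        return f"semantic error: cannot assign {actual} to {declared} '{name}'"
    return None


symbols = {"x": "int"}                           # recorded from the declaration: int x;
print(check_assignment(symbols, "x", '"Hello"'))
# semantic error: cannot assign string to int 'x'
print(check_assignment(symbols, "x", "10"))
# None
```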
---
4. Intermediate Code Generation
Translates the program into a machine-independent intermediate representation, such as three-address code.
Example:
a = b + c
becomes
t1 = b + c
a = t1
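The translation above can be sketched as a walk over an expression tree that creates one temporary per operator. The tree format (nested tuples of operator, left, right) and the function name `gen_tac` are assumptions for this example, not a standard representation.

```python
def gen_tac(target, expr):
    """Return a list of three-address instructions that assign expr to target."""
    code = []
    counter = 0

    def emit(node):
        nonlocal counter
        if isinstance(node, str):          # leaf: a variable name or a constant
            return node
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        counter += 1
        temp = f"t{counter}"               # a new temporary for this operator
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    result = emit(expr)
    code.append(f"{target} = {result}")
    return code


for line in gen_tac("a", ("+", "b", "c")):
    print(line)
# t1 = b + c
# a = t1
```

A nested expression such as ("+", "b", ("*", "c", "d")) produces t1 = c * d, then t2 = b + t1, then a = t2.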
---
5. Code Optimization
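Improves the intermediate code so that the final program runs faster or uses less memory, without changing what it does.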
Example:
a = 5 + 3
Optimized to:
a = 8
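This particular optimization is known as constant folding. A minimal sketch, assuming expressions are stored as nested tuples (operator, left, right) with integers for constants and strings for variables; the name `fold` is hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold(node):
    """Replace every sub-expression whose operands are both constants by its value."""
    if not isinstance(node, tuple):
        return node                        # a constant or a variable name
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        return OPS[op](left, right)        # computed at compile time
    return (op, left, right)


print(fold(("+", 5, 3)))                   # 8
print(fold(("+", "b", ("*", 2, 4))))       # ('+', 'b', 8)
```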
---
6. Code Generation
Produces the target program (machine code or assembly code) from the optimized intermediate code.
---
Front End and Back End of Compiler
Front End
Handles:
Lexical Analysis
Syntax Analysis
Semantic Analysis
Intermediate Code Generation
Purpose:
Analyze the source program and translate it into an intermediate representation
---
Back End
Handles:
Code Optimization
Code Generation
Purpose:
Produce efficient machine code for the target machine
---
Linker
Combines object files and library code into a single executable, resolving references between them.
Example:
main.obj
math.obj
file.obj
↓
Linker
↓
program.exe
---
Loader
Loads the executable program into main memory so that it can run.
program.exe
↓
Loader
↓
Memory
---
Types of Compiler
Single Pass Compiler
Scans the source program only once.
Advantage:
Fast compilation
---
Multi Pass Compiler
Scans the source program, or its intermediate form, several times.
Advantage:
Better optimization
---
Cross Compiler
Runs on one machine but generates code for another.
Example:
A compiler running on a Windows PC that generates code for an ARM processor.
---
Lexical Analyzer
Also called the scanner.
It is the first phase of the compiler.
Responsibilities:
Read characters
Form tokens
Remove comments
Ignore spaces
---
Token
The smallest meaningful unit of a program, such as a keyword, identifier, operator, constant or delimiter.
Example:
int age=20;
Tokens:
int
age
=
20
;
---
Lexeme
The actual sequence of characters in the source program that matches the pattern of a token.
Example:
count
Here:
Lexeme = count
Token = identifier
---
Pattern
A rule that describes the set of lexemes belonging to a token, usually written as a regular expression.
Example:
[a-zA-Z][a-zA-Z0-9]*
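Represents identifiers: a letter followed by any number of letters or digits. (This is a simplified pattern; C identifiers may also contain underscores.)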
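The pattern can be tried out directly with Python's `re` module. `fullmatch` is used so that the whole lexeme has to fit the pattern; the helper name `is_identifier` is made up for this example.

```python
import re

IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")


def is_identifier(lexeme):
    """Return True if the whole lexeme matches the identifier pattern."""
    return IDENTIFIER.fullmatch(lexeme) is not None


for lexeme in ["count", "x1", "1x", "total_marks"]:
    print(lexeme, is_identifier(lexeme))
# count True
# x1 True
# 1x False
# total_marks False
```

Keywords such as int also fit this pattern, so a scanner normally checks a matched lexeme against the list of reserved words before calling it an identifier.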
---
Recognition of Tokens
Lexical Analyzer identifies:
Keywords
Identifiers
Operators
Constants
Delimiters
---
Examples
Keywords
int
float
while
if
return
---
Identifiers
count
salary
marks
---
Operators
+
-
*
/
=
---
Delimiters
;
,
()
{}
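Using the example lists above, sorting a lexeme into one of these classes can be sketched as a series of lookups. The function name `classify` and the class names it returns are assumptions for this illustration, and constants are limited to unsigned integers.

```python
KEYWORDS = {"int", "float", "while", "if", "return"}
OPERATORS = {"+", "-", "*", "/", "="}
DELIMITERS = {";", ",", "(", ")", "{", "}"}


def classify(lexeme):
    """Return the token class of a single lexeme."""
    if lexeme in KEYWORDS:                 # checked before identifiers on purpose
        return "keyword"
    if lexeme in OPERATORS:
        return "operator"
    if lexeme in DELIMITERS:
        return "delimiter"
    if lexeme.isdigit():
        return "constant"
    if lexeme[:1].isalpha() and lexeme.isalnum():
        return "identifier"
    return "invalid"


for lexeme in ["int", "age", "=", "20", ";", "2x"]:
    print(lexeme, classify(lexeme))
# int keyword
# age identifier
# = operator
# 20 constant
# ; delimiter
# 2x invalid
```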
---
Input Buffering
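Reduces the cost of reading the source file by fetching it in blocks instead of one character at a time.
Without buffering:
One read operation for every character
With buffering:
One read operation for a whole block of characters, which the scanner then consumes from memory
Benefits:
Fewer input operations, so faster scanning and faster compilation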
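The difference can be sketched by counting read operations. `CountingStream` and `read_chars` are hypothetical helpers written for these notes; the scanner still receives one character at a time, but the underlying source is only asked for whole blocks.

```python
import io


class CountingStream:
    """Wraps a text source and counts how many read operations are made."""

    def __init__(self, text):
        self._stream = io.StringIO(text)
        self.reads = 0

    def read(self, size):
        self.reads += 1
        return self._stream.read(size)


def read_chars(stream, block_size):
    """Yield characters one at a time while reading the stream in blocks."""
    while True:
        block = stream.read(block_size)
        if not block:                      # an empty result means end of input
            break
        yield from block


unbuffered = CountingStream("int a=10;")
buffered = CountingStream("int a=10;")
text1 = "".join(read_chars(unbuffered, block_size=1))
text2 = "".join(read_chars(buffered, block_size=4))
print(unbuffered.reads, buffered.reads)    # 10 4
```

Both runs deliver the same nine characters. The last read in each run is the one that detects the end of input.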
---
Issues in Lexical Analysis
1. Removing comments
2. Handling white spaces
3. Detecting invalid tokens
4. Efficient scanning
---
Lexical Analyzer Generator
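A tool that generates a lexical analyzer automatically from a description of the token patterns.
Example:
LEX (and its free successor, Flex)
Input:
Token rules (regular expressions with associated actions)
Output:
Lexical analyzer program (a C source file, lex.yy.c)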
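The idea of "rules in, scanner out" can be imitated in a few lines. This is not how LEX itself works internally (LEX reads a specification file and writes C code); it is only a sketch in which a hypothetical function `make_lexer` turns a rule table into a scanner function. Unlike LEX, which prefers the longest match, this sketch takes the first rule that matches.

```python
import re


def make_lexer(rules):
    """Build a scanner function from (token_name, regex) rules, tried in order."""
    combined = re.compile(
        "|".join(f"(?P<{name}>{pattern})" for name, pattern in rules)
    )

    def lexer(text):
        tokens = []
        pos = 0
        while pos < len(text):
            match = combined.match(text, pos)
            if match is None:
                raise ValueError(f"invalid character {text[pos]!r} at position {pos}")
            if match.lastgroup != "SKIP":  # whitespace is matched but not reported
                tokens.append((match.lastgroup, match.group()))
            pos = match.end()
        return tokens

    return lexer


rules = [
    ("KEYWORD", r"\b(?:int|float|while|if|return)\b"),
    ("IDENTIFIER", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("CONSTANT", r"\d+"),
    ("OPERATOR", r"[+\-*/=]"),
    ("DELIMITER", r"[;,(){}]"),
    ("SKIP", r"\s+"),
]
lexer = make_lexer(rules)
print(lexer("int age=20;"))
# [('KEYWORD', 'int'), ('IDENTIFIER', 'age'), ('OPERATOR', '='), ('CONSTANT', '20'), ('DELIMITER', ';')]
```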
---
Important Exam Questions
Short Questions
1. Define Compiler.
2. Define Token.
3. Define Lexeme.
4. What is a Loader?
5. What is a Linker?
6. What is Input Buffering?
7. What is Lexical Analysis?
8. What is LEX?
---
Long Questions
1. Explain structure and phases of a compiler.
2. Differentiate Compiler and Interpreter.
3. Explain Lexical Analysis with examples.
4. Explain Front End and Back End of Compiler.
5. Discuss Tokens, Lexemes and Patterns.
6. Explain Linker and Loader.
---
Quick Revision
Compiler = High-level → Machine code.
Interpreter = Line-by-line translation and execution.
Assembler = Assembly → Machine code.
Lexical Analysis = First phase.
Token = Smallest meaningful unit.
Lexeme = Actual text.
Pattern = Rule for token.
Linker = Combines object files.
Loader = Loads program into memory.
LEX = Lexical analyzer generator.