Compiler Design (MCA554) Unit 1: Introduction to Compilers & Lexical Analysis

From your MCA Semester III syllabus. 

---


What is a Compiler?

A compiler is a program that translates a program written in a high-level language into a lower-level target language, usually machine language.

Example:

printf("Hello");

The compiler converts this statement into machine code that the computer can execute.

---

Language Translators

1. Compiler

Converts the entire program at once.

Source Program

      ↓

   Compiler

      ↓

Object Code

Advantages:

Faster execution

Errors are reported together at compile time, before the program runs

Examples:

C

C++

Go

---

2. Interpreter

Translates and executes the program line by line.

Source Program

      ↓

 Interpreter

      ↓

 Execute Line by Line


Advantages:

Easy debugging

Disadvantages:

Slower execution

Examples (as commonly implemented):

Python

JavaScript

---

3. Assembler

Converts Assembly Language to Machine Language.

Assembly Code

      ↓

   Assembler

      ↓

Machine Code

---


Difference Between Compiler and Interpreter

Compiler | Interpreter

Entire program compiled | Line by line

Faster execution | Slower execution

Generates object code | No object code

Errors after compilation | Errors immediately

---

Structure of a Compiler

A compiler consists of several phases.

Source Program

      ↓

Lexical Analysis

      ↓

Syntax Analysis

      ↓

Semantic Analysis

      ↓

Intermediate Code Generation

      ↓

Code Optimization

      ↓

Code Generation

      ↓

Target Program

---


Phases of Compiler

1. Lexical Analysis

First phase.

Tasks:

Read source code

Remove spaces/comments

Generate tokens

Example:

int a=10;

Tokens:

int

a

=

10

;

---

2. Syntax Analysis

Checks grammatical correctness.

Uses:

Parse Trees

Context Free Grammar

Example:

a = + 5

Syntax Error
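
The check above can be sketched as a tiny parser. This is only an illustration, assuming a made-up grammar (identifier = operand, followed by any number of "+ operand" or "- operand"). The names tokenize, is_operand and parse_assignment are hypothetical, not taken from any real compiler.

```python
import re

IDENT = r"[A-Za-z_]\w*"

def tokenize(text):
    """Split a statement into identifiers, numbers and single-character symbols."""
    return re.findall(IDENT + r"|\d+|\S", text)

def is_identifier(token):
    return re.fullmatch(IDENT, token) is not None

def is_operand(token):
    """An operand is an identifier or an integer constant."""
    return is_identifier(token) or token.isdigit()

def parse_assignment(text):
    """Return True if text fits: identifier '=' operand (('+'|'-') operand)*."""
    tokens = tokenize(text)
    pos = 0

    def expect(predicate, what):
        # Consume one token if it satisfies the predicate, else report an error.
        nonlocal pos
        if pos >= len(tokens) or not predicate(tokens[pos]):
            found = tokens[pos] if pos < len(tokens) else "end of input"
            raise SyntaxError(f"expected {what}, found {found!r}")
        pos += 1

    expect(is_identifier, "identifier")
    expect(lambda t: t == "=", "'='")
    expect(is_operand, "operand")
    while pos < len(tokens):
        expect(lambda t: t in ("+", "-"), "'+' or '-'")
        expect(is_operand, "operand")
    return True
```

With this sketch, parse_assignment("a = 5 + 3") succeeds, while parse_assignment("a = + 5") raises SyntaxError because an operand is expected after "=".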

---

3. Semantic Analysis

Checks meaning.

Example:

int x;

x = "Hello";

Semantic Error
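
A type check like the one above could be sketched with a symbol table that remembers each declared type. This is a simplified illustration. The names symbol_table, type_of_literal and check_assignment are assumptions made for this example, and only int and string literals are handled.

```python
def type_of_literal(text):
    """Classify a literal: quoted text is a string, a run of digits is an int."""
    if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
        return "string"
    if text.isdigit():
        return "int"
    raise ValueError(f"unsupported literal: {text}")

def check_assignment(symbol_table, name, literal):
    """Raise TypeError if the literal's type differs from the declared type."""
    if name not in symbol_table:
        raise NameError(f"{name} is not declared")
    declared = symbol_table[name]
    actual = type_of_literal(literal)
    if declared != actual:
        raise TypeError(f"cannot assign {actual} to {name} of type {declared}")
    return True

# The declaration "int x;" is recorded in the (hypothetical) symbol table.
symbol_table = {"x": "int"}
```

Here check_assignment(symbol_table, "x", "10") passes, while check_assignment(symbol_table, "x", '"Hello"') raises TypeError, which corresponds to the semantic error in the example.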

---

4. Intermediate Code Generation

Creates intermediate representation.

Example:

a = b + c

becomes

t1 = b + c

a = t1
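
The translation above can be sketched in a few lines. As a shortcut, this example borrows Python's own ast module to parse the statement (an assumption for illustration only; a real compiler would use its own parser). The function name three_address_code is hypothetical.

```python
import ast

# Map Python AST operator nodes to the symbols printed in the output.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def three_address_code(statement):
    """Translate a simple assignment such as 'a = b + c' into three-address code."""
    code = []
    counter = 0

    def emit(node):
        """Return the name holding the node's value, creating temporaries as needed."""
        nonlocal counter
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = emit(node.left)
            right = emit(node.right)
            counter += 1
            temp = f"t{counter}"
            code.append(f"{temp} = {left} {OPS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    assign = ast.parse(statement).body[0]
    if not (isinstance(assign, ast.Assign)
            and len(assign.targets) == 1
            and isinstance(assign.targets[0], ast.Name)):
        raise ValueError("expected a simple assignment")
    code.append(f"{assign.targets[0].id} = {emit(assign.value)}")
    return code
```

For the statement in the example, three_address_code("a = b + c") gives the two lines t1 = b + c and a = t1. A longer expression such as a = b + c * d needs two temporaries, because each instruction holds at most one operator.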

---

5. Code Optimization

Improves the efficiency of the code without changing its meaning.

Example:

a = 5 + 3

Optimized to:

a = 8
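
The optimization above is usually called constant folding. A minimal sketch, assuming statements of the form "name = integer operator integer" with only +, - and * supported; the function name fold_constants is hypothetical.

```python
import re

# Matches: target = <int> <op> <int>, with optional spaces around each part.
FOLDABLE = re.compile(r"\s*(\w+)\s*=\s*(\d+)\s*([+\-*])\s*(\d+)\s*")

def fold_constants(statement):
    """Evaluate a constant expression at compile time; leave other statements alone."""
    match = FOLDABLE.fullmatch(statement)
    if match is None:
        return statement
    target, left, op, right = match.groups()
    left, right = int(left), int(right)
    if op == "+":
        value = left + right
    elif op == "-":
        value = left - right
    else:
        value = left * right
    return f"{target} = {value}"
```

fold_constants("a = 5 + 3") returns "a = 8". A statement such as "a = b + 3" is returned unchanged, because the value of b is not known at compile time.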

---


6. Code Generation

Produces target machine code from the optimized intermediate code.

---


Front End and Back End of Compiler

Front End

Handles:

Lexical Analysis

Syntax Analysis

Semantic Analysis

Intermediate Code Generation

Purpose:

Understand source code

---


Back End

Handles:

Optimization

Code Generation

Purpose:

Produce efficient machine code

---

Linker

Combines object files.

Example:

main.obj

math.obj

file.obj

    ↓

  Linker

    ↓

program.exe

---


Loader

Loads executable program into memory.

program.exe

     ↓

   Loader

     ↓

 Memory

---

Types of Compiler

Single Pass Compiler

Scans the source program only once.

Advantage:

Fast

---

Multi Pass Compiler

Scans the source program (or its intermediate form) several times.

Advantage:

Better optimization

---

Cross Compiler

Runs on one machine but generates code for another.

Example:

A compiler that runs on a Windows PC but generates code for an ARM processor

---

Lexical Analyzer

Also called:

Scanner

First phase of compiler.


Responsibilities:

Read characters

Form tokens

Remove comments

Ignore spaces

---

Token

Smallest meaningful unit.

Example:

int age=20;

Tokens:

int

age

=

20

;

---


Lexeme

Actual sequence of characters in the source program that forms a token.

Example:

count

Here:

Lexeme = count

---

Pattern

Rule describing the set of lexemes that belong to a token.

Example:

[a-zA-Z][a-zA-Z0-9]*

Represents identifiers.
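
The pattern above can be tried out directly as a regular expression. A small sketch using Python's re module; the function name is_identifier is an assumption for this example. Note that the pattern as written does not allow underscores, which C identifiers do permit.

```python
import re

# The identifier pattern from the notes: a letter, then letters or digits.
IDENTIFIER = re.compile(r"[a-zA-Z][a-zA-Z0-9]*")

def is_identifier(lexeme):
    """Return True if the whole lexeme matches the identifier pattern."""
    return IDENTIFIER.fullmatch(lexeme) is not None
```

With this pattern, is_identifier("count") and is_identifier("count2") are True, while is_identifier("2count") is False because the first character must be a letter.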


---

Recognition of Tokens

Lexical Analyzer identifies:

Keywords

Identifiers

Operators

Constants

Delimiters

---


Examples

Keywords

int

float

while

if

return

---

Identifiers

count

salary

marks

---

Operators

+

-

*

/

=

---

Delimiters

;

,

()

{}
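
The token classes listed above can be recognized with a small hand-written scanner. This is a minimal sketch for a tiny C-like subset, assuming integer constants only and no comment handling. The name scan and the class labels are made up for this example.

```python
KEYWORDS = {"int", "float", "while", "if", "return"}
OPERATORS = set("+-*/=")
DELIMITERS = set(";,(){}")

def scan(source):
    """Return a list of (token_class, lexeme) pairs, reading one character at a time."""
    tokens = []
    i = 0
    while i < len(source):
        ch = source[i]
        if ch.isspace():
            i += 1                      # white space separates tokens and is dropped
        elif ch.isalpha():
            start = i
            while i < len(source) and source[i].isalnum():
                i += 1
            lexeme = source[start:i]
            kind = "keyword" if lexeme in KEYWORDS else "identifier"
            tokens.append((kind, lexeme))
        elif ch.isdigit():
            start = i
            while i < len(source) and source[i].isdigit():
                i += 1
            tokens.append(("constant", source[start:i]))
        elif ch in OPERATORS:
            tokens.append(("operator", ch))
            i += 1
        elif ch in DELIMITERS:
            tokens.append(("delimiter", ch))
            i += 1
        else:
            raise ValueError(f"invalid character {ch!r} at position {i}")
    return tokens
```

For the input int age=20; this sketch produces a keyword (int), an identifier (age), an operator (=), a constant (20) and a delimiter (;). A character outside these classes, such as @, is reported as an invalid token.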


---

Input Buffering

Improves reading efficiency.

Without buffering:

Character by character reading

With buffering:

Reads blocks of data

Benefits:

Faster compilation
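
The difference between the two approaches can be sketched by counting read calls. This is an illustration only: textbook schemes typically use a pair of buffers with sentinel characters, while this example shows just the block-reading idea. The names buffered_chars and CountingStream are hypothetical.

```python
import io

def buffered_chars(stream, block_size=4096):
    """Yield characters one at a time while reading the stream block by block."""
    while True:
        block = stream.read(block_size)   # one read call fetches a whole block
        if not block:                     # an empty string means end of input
            return
        yield from block

class CountingStream(io.StringIO):
    """In-memory text stream that counts how many times read() is called."""

    def __init__(self, text):
        super().__init__(text)
        self.reads = 0

    def read(self, size=-1):
        self.reads += 1
        return super().read(size)
```

The scanner still sees one character at a time, but the number of read calls drops. For the 9-character input int a=10; a block size of 4 needs 4 read calls (three blocks plus the final empty read), while a block size of 1 needs 10.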

---

Issues in Lexical Analysis

1. Removing comments

2. Handling white spaces

3. Detecting invalid tokens

4. Efficient scanning


---

Lexical Analyzer Generator

Tool used to generate lexical analyzers automatically.

Example:

LEX

Input:

Token Rules

Output:

Lexical Analyzer Program
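
The idea of turning token rules into a scanner can be imitated in a few lines. This is not LEX itself (LEX reads a rule file and emits a C program); it is only a sketch of the same idea using Python's re module. The names make_lexer and RULES are assumptions for this example. Unlike LEX, which prefers the longest match, this sketch simply tries the rules in the order given.

```python
import re

def make_lexer(rules):
    """Build a lexer function from (token_name, regex) rules, tried in order."""
    combined = re.compile(
        "|".join(f"(?P<{name}>{pattern})" for name, pattern in rules)
    )

    def lexer(source):
        tokens = []
        pos = 0
        while pos < len(source):
            match = combined.match(source, pos)
            if match is None:
                raise ValueError(f"invalid token at position {pos}")
            if match.lastgroup != "SKIP":          # white space is discarded
                tokens.append((match.lastgroup, match.group()))
            pos = match.end()
        return tokens

    return lexer

# Token rules: the "input" from which the lexical analyzer is generated.
RULES = [
    ("KEYWORD", r"\b(?:int|float|while|if|return)\b"),
    ("IDENTIFIER", r"[a-zA-Z][a-zA-Z0-9]*"),
    ("CONSTANT", r"\d+"),
    ("OPERATOR", r"[+\-*/=]"),
    ("DELIMITER", r"[;,(){}]"),
    ("SKIP", r"\s+"),
]
```

Calling make_lexer(RULES) returns a function that acts as the generated lexical analyzer. Changing the token set only requires editing RULES, not the scanning loop.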

---

Important Exam Questions

Short Questions

1. Define Compiler.

2. Define Token.

3. Define Lexeme.

4. What is a Loader?

5. What is a Linker?

6. What is Input Buffering?

7. What is Lexical Analysis?

8. What is LEX?

---

Long Questions

1. Explain structure and phases of a compiler.

2. Differentiate Compiler and Interpreter.

3. Explain Lexical Analysis with examples.

4. Explain Front End and Back End of Compiler.

5. Discuss Tokens, Lexemes and Patterns.

6. Explain Linker and Loader.

---

Quick Revision

Compiler = High-level → Machine code.

Interpreter = Line-by-line translation.

Assembler = Assembly → Machine code.

Lexical Analysis = First phase.

Token = Smallest meaningful unit.

Lexeme = Actual text.

Pattern = Rule for token.

Linker = Combines object files.

Loader = Loads program into memory.

LEX = Lexical analyzer generator.


