Unit 1: Introduction to Machine Learning

Subject: Machine Learning Techniques (MCA556)

From your Semester III syllabus.

---

What is Machine Learning?

Machine Learning (ML) is a branch of Artificial Intelligence (AI) that enables computers to learn from data and make decisions without being explicitly programmed.

Examples

Netflix recommends movies.

YouTube recommends videos.

Gmail detects spam emails.

---

Basic Definitions

Data

Raw facts and figures.

Example:

Age = 20

Marks = 85

Dataset

Collection of data.

Example:

Age   Marks
18    70
19    75
20    85

---

Learning

Learning means improving performance using experience (data).

Formula:

Experience (Data) → Learning → Better Predictions

---

Types of Machine Learning

The syllabus covers several learning types.

1. Supervised Learning

Data contains inputs and correct outputs (labels).

Examples:

Predicting house prices

Predicting exam results

Algorithms:

Linear Regression

Decision Trees

SVM

---

2. Unsupervised Learning

Data has no labels.

Purpose:

Find hidden patterns

Group similar data

Examples:

Customer segmentation

Clustering

Algorithms:

K-Means

Hierarchical Clustering
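To make "group similar data" concrete, here is a minimal sketch of the K-Means idea on one-dimensional data. The spending figures, the starting centers, and the function name k_means_1d are illustrative assumptions, not part of the syllabus; real implementations handle many dimensions and choose starting centers more carefully.

```python
def k_means_1d(points, centers, iterations=10):
    """Repeat two steps: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical monthly spending of six customers (no labels given).
spending = [200, 220, 250, 900, 950, 1000]
centers, clusters = k_means_1d(spending, centers=[200, 1000])
print(clusters)  # low spenders and high spenders end up in separate groups
```

No labels were supplied; the two groups emerge only from how close the values are to each other.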

---

3. Reinforcement Learning

Learning through rewards and penalties.

Examples:

Self-driving cars

Game-playing AI

---

Hypothesis Space

A hypothesis is a possible solution/model.

Example: For predicting marks:

Marks = 5 × Study Hours + 30

All possible models together form the Hypothesis Space.
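The idea can be sketched in a few lines: treat every (slope, intercept) pair as one hypothesis, and treat learning as a search for the hypothesis that fits the data best. The data points and the small grid of candidate values below are assumptions chosen for illustration.

```python
# Hypothetical training data: (study_hours, marks).
data = [(2, 40), (4, 50), (6, 60), (8, 70)]

# A tiny hypothesis space: every line Marks = w * hours + b
# for a handful of candidate slopes w and intercepts b.
hypothesis_space = [(w, b) for w in (3, 5, 7) for b in (20, 30, 40)]

def mean_squared_error(w, b, rows):
    """Average squared gap between predicted and actual marks."""
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

# Learning = picking the hypothesis with the lowest error on the data.
best_w, best_b = min(
    hypothesis_space,
    key=lambda h: mean_squared_error(h[0], h[1], data),
)
print(f"Best hypothesis: Marks = {best_w} x Study Hours + {best_b}")
```

Here the space holds only 9 hypotheses; a real learning algorithm searches a much larger (often infinite) space.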

---

Inductive Bias

Assumptions made by a learning algorithm so that it can generalize to unseen data.

Example: Linear Regression assumes a linear relationship.

---

Evaluation of a Model

After training, we evaluate performance.

Questions:

Is the model accurate?

Can it predict correctly on new data?
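One simple way to answer the first question is accuracy: the fraction of predictions that match the true labels. A minimal sketch follows; the labels and predictions are made-up values for illustration, not the output of a real model.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels for five emails the model has never seen.
actual    = ["spam", "ham", "spam", "ham", "ham"]
predicted = ["spam", "ham", "ham",  "ham", "ham"]

print(accuracy(actual, predicted))  # 4 of 5 correct
```

Measuring accuracy on data the model did not train on is what addresses the second question.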

---

Cross Validation

Used to test model reliability.

Most common:

K-Fold Cross Validation

Steps:

1. Split data into K parts.

2. Train on K−1 parts.

3. Test on remaining part.

4. Repeat K times, using a different part as the test set each time.

5. Calculate average accuracy.

Benefits:

More reliable performance estimate than a single train/test split

Helps detect overfitting
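The five steps can be sketched in plain Python. This is a minimal illustration, not a library implementation: the "model" is a hypothetical majority-label predictor, the labels are made up, and the data is not shuffled before splitting (real code usually shuffles first).

```python
from collections import Counter

def k_fold_indices(n_samples, k):
    """Step 1: split indices 0..n_samples-1 into k nearly equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def train_majority(labels):
    """Hypothetical 'model': always predict the most common training label."""
    return Counter(labels).most_common(1)[0][0]

def cross_validate(labels, k):
    scores = []
    for test_idx in k_fold_indices(len(labels), k):
        held_out = set(test_idx)
        # Step 2: train on the other k-1 folds.
        train_labels = [y for i, y in enumerate(labels) if i not in held_out]
        prediction = train_majority(train_labels)
        # Step 3: test on the remaining fold.
        correct = sum(labels[i] == prediction for i in test_idx)
        scores.append(correct / len(test_idx))
    # Steps 4 and 5: the loop ran k times; report the average accuracy.
    return sum(scores) / len(scores)

labels = [1, 1, 1, 1, 0, 1, 1, 1, 1, 0]  # illustrative labels only
print(cross_validate(labels, k=5))
```

Every sample is used for testing exactly once, which is why the averaged score is steadier than the score from one arbitrary split.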

---

Linear Regression

One of the simplest ML algorithms.

Used for:

Predicting continuous values

Example:

House price prediction

Salary prediction

The model is represented by:

genui{"math_block_widget_always_prefetch_v2":{"content":"y=mx+b"}}Where:

y = predicted value

x = input value (feature)

m = slope

b = intercept
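As a sketch of how m and b can be found from data, the ordinary least squares formulas for a single input are m = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2) and b = mean_y - m * mean_x. The study-hours data and the function name fit_line below are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope m and intercept b for y = m*x + b."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - m * mean_x
    return m, b

# Hypothetical data: study hours and marks.
hours = [1, 2, 3, 4]
marks = [35, 40, 45, 50]

m, b = fit_line(hours, marks)
print(f"y = {m}x + {b}")
print("Predicted marks for 6 hours:", m * 6 + b)
```

Because these points lie exactly on a line, the fit recovers it exactly; with noisy real data the line would only be the closest approximation.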

---

Decision Trees

A tree-like model used for classification and regression. Each internal node asks a question about a feature, and each leaf gives a prediction.

Example:

Study?

Yes → Pass

No → Fail

Advantages:

Easy to understand

Easy to visualize
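A one-question tree (often called a decision stump) is enough to show how a tree is learned from data. The sketch below tries each candidate threshold on study hours and keeps the one with the fewest training mistakes. The data and the name fit_stump are illustrative assumptions; full decision tree algorithms typically choose splits with measures such as Gini impurity or entropy and grow more than one level.

```python
def fit_stump(hours, labels):
    """Find the threshold t so that the rule
    'Pass if hours >= t, else Fail' makes the fewest training errors."""
    best_threshold, best_errors = None, len(labels) + 1
    for t in sorted(set(hours)):
        errors = sum(
            ("Pass" if h >= t else "Fail") != y
            for h, y in zip(hours, labels)
        )
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold

def predict(hours_studied, threshold):
    return "Pass" if hours_studied >= threshold else "Fail"

# Hypothetical training data.
hours = [0, 1, 2, 3, 4, 5]
labels = ["Fail", "Fail", "Fail", "Pass", "Pass", "Pass"]

threshold = fit_stump(hours, labels)
print("Learned rule: Pass if hours >=", threshold)
print(predict(1, threshold), predict(4, threshold))
```

The learned rule reads like a sentence, which is what makes trees easy to understand.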

---

Overfitting

Occurs when a model memorizes training data instead of learning patterns.

Symptoms

High training accuracy

Poor test accuracy

Example: Student memorizes answers but cannot solve new questions.
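The student analogy can be turned into a small experiment. In this sketch the "memorizer" is a deliberately extreme caricature of overfitting: a lookup table that returns 0 for any input it has not seen. The data values and both model functions are assumptions for illustration.

```python
# Hypothetical data: (study_hours, marks).
train = [(1, 35), (2, 40), (3, 45), (4, 50)]
test = [(5, 55), (6, 60)]

lookup = dict(train)

def memorizer(hours):
    """Memorizes training answers; knows nothing about new inputs."""
    return lookup.get(hours, 0)

def line_model(hours):
    """Captures the underlying pattern instead of individual answers."""
    return 5 * hours + 30

def mse(model, rows):
    """Mean squared error of a model on (input, target) pairs."""
    return sum((model(x) - y) ** 2 for x, y in rows) / len(rows)

for name, model in [("memorizer", memorizer), ("line", line_model)]:
    print(name, "train error:", mse(model, train), "test error:", mse(model, test))
```

Both models are perfect on the training data; only the test error reveals that one of them has memorized rather than learned.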

---

Learning System Design

Steps:

1. Collect Data

2. Preprocess Data

3. Select Features

4. Train Model

5. Evaluate Model

6. Deploy Model

---

Perspectives and Issues in ML

Common challenges:

Data Quality

Bad data → Bad predictions

Overfitting

Model learns noise

Underfitting

Model is too simple

Computational Cost

Large datasets need more resources

---

Ensemble Learning

Combines multiple models to improve performance.

Idea:

Many Weak Models

↓

Combined

↓

Strong Model

Examples:

Random Forest

Boosting
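A minimal sketch of the combining step, using majority voting. The three prediction lists are hard-coded, made-up outputs standing in for three weak models; each is wrong on a different sample. Random Forest and Boosting build and combine their models in more elaborate ways than this.

```python
from collections import Counter

def majority_vote(votes):
    """Return the label predicted by most models."""
    return Counter(votes).most_common(1)[0][0]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

true_labels = [1, 0, 1, 1, 0]

# Hypothetical predictions from three weak models.
model_a = [1, 0, 1, 0, 0]  # wrong on the 4th sample
model_b = [0, 0, 1, 1, 0]  # wrong on the 1st sample
model_c = [1, 1, 1, 1, 0]  # wrong on the 2nd sample

ensemble = [majority_vote(votes) for votes in zip(model_a, model_b, model_c)]

for name, preds in [("A", model_a), ("B", model_b), ("C", model_c), ("Ensemble", ensemble)]:
    print(name, accuracy(true_labels, preds))
```

The vote helps here because the models make their mistakes on different samples; models that all fail on the same inputs would gain nothing from being combined.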

---

Applications of Machine Learning

Healthcare

Disease prediction

Banking

Fraud detection

Education

Student performance prediction

E-commerce

Product recommendations

Agriculture

Crop prediction

---

Feature Engineering

Process of creating, transforming, and selecting useful features from raw data.

Example:

Original Data:

Date: 13-06-2026

Feature Engineering:

Day = Saturday

Month = June

Year = 2026

Benefits:

Improves accuracy

Reduces complexity
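The date example can be reproduced with Python's standard datetime module. This is a minimal sketch; the function name date_features and the dictionary keys are illustrative choices.

```python
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

def date_features(text):
    """Turn a DD-MM-YYYY string into separate day, month, and year features."""
    d = datetime.strptime(text, "%d-%m-%Y")
    return {
        "day": DAY_NAMES[d.weekday()],      # weekday(): Monday is 0
        "month": MONTH_NAMES[d.month - 1],  # month: January is 1
        "year": d.year,
    }

print(date_features("13-06-2026"))
```

A model cannot do much with the raw string, but features such as the day of the week can expose patterns like weekend effects.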

---

Important Exam Questions

Short Questions

1. Define Machine Learning.

2. What is Supervised Learning?

3. What is Unsupervised Learning?

4. Define Reinforcement Learning.

5. What is Cross Validation?

6. What is Overfitting?

7. What is Feature Engineering?

8. Define Hypothesis Space.

---

Long Questions

1. Explain different types of Machine Learning.

2. Discuss Cross Validation with examples.

3. Explain Linear Regression.

4. Explain Decision Trees.

5. What is Overfitting? How can it be reduced?

6. Explain the design of a learning system.

---

Quick Revision

ML = Learning from data.

Supervised = Labeled data.

Unsupervised = Unlabeled data.

Reinforcement = Reward/Penalty.

Linear Regression = Predicts continuous values.

Decision Tree = Tree-based model.

Overfitting = Memorizing training data.

Cross Validation = Reliable testing.

Feature Engineering = Creating useful features.

Next: Unit 2

Evaluation Metrics (Precision, Recall, F1, MSE), K-Means Clustering, Bayes Learning, Gaussian Mixture Models, Feature Reduction. This unit is very important for both exams and ML interviews.
