Subject: Machine Learning Techniques (MCA556)
From your Semester III syllabus.
---
What is Machine Learning?
Machine Learning (ML) is a branch of Artificial Intelligence (AI) that enables computers to learn from data and make decisions without being explicitly programmed.
Example
Netflix recommends movies.
YouTube recommends videos.
Gmail detects spam emails.
---
Basic Definitions
Data
Raw facts and figures.
Example:
Age = 20
Marks = 85
Dataset
Collection of data.
Example:
Age Marks
18 70
19 75
20 85
---
Learning
Learning means improving performance using experience (data).
In short:
Experience (Data) → Learning → Better Predictions
---
Types of Machine Learning
The syllabus covers several learning types.
1. Supervised Learning
Data contains inputs and correct outputs (labels).
Examples:
Predicting house prices
Predicting exam results
Algorithms:
Linear Regression
Decision Trees
SVM
---
2. Unsupervised Learning
Data has no labels.
Purpose:
Find hidden patterns
Group similar data
Examples:
Customer segmentation
Clustering
Algorithms:
K-Means
Hierarchical Clustering
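A minimal sketch of the K-Means idea on one-dimensional data may help here. The ages and the starting centroids below are made-up values chosen for illustration; real implementations usually pick starting centroids at random and work on many features at once.

```python
def k_means_1d(points, centroids, iterations=10):
    """Very small K-Means for 1-D data; returns (centroids, clusters)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its old centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical customer ages and hand-picked starting centroids.
ages = [18, 19, 20, 45, 47, 50]
centroids, clusters = k_means_1d(ages, centroids=[18, 50])
print(clusters)
```

No labels are given to the function: the two groups emerge only from how close the values are to each other.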
---
3. Reinforcement Learning
Learning through rewards and penalties.
Example:
Self-driving cars
Game-playing AI
---
Hypothesis Space
A hypothesis is one candidate model that maps inputs to outputs.
Example: For predicting marks:
Marks = 5 × Study Hours + 30
All possible models together form the Hypothesis Space.
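As a rough illustration, learning can be viewed as searching the hypothesis space for the model that fits the data best. The sketch below assumes a tiny hypothesis space of four hand-picked lines and four made-up training examples; a real hypothesis space is usually far larger, often infinite.

```python
# Training examples: (study_hours, marks). Made-up values for illustration.
examples = [(2, 40), (4, 50), (6, 60), (8, 70)]

# A tiny hypothesis space: each hypothesis is a (slope, intercept) pair
# for the model  marks = slope * study_hours + intercept.
hypothesis_space = [(3, 30), (5, 30), (5, 20), (10, 0)]

def squared_error(hypothesis, data):
    """Sum of squared differences between predicted and actual marks."""
    slope, intercept = hypothesis
    return sum((slope * hours + intercept - marks) ** 2 for hours, marks in data)

# Pick the hypothesis with the lowest error on the training examples.
best = min(hypothesis_space, key=lambda h: squared_error(h, examples))
print(best)  # (5, 30)
```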
---
Inductive Bias
Assumptions made by a learning algorithm so that it can generalize from training data to unseen data.
Example: Linear Regression assumes a linear relationship.
---
Evaluation of a Model
After training, we evaluate performance.
Questions:
Is the model accurate?
Can it predict correctly on new data?
---
Cross Validation
Used to test model reliability.
Most common:
K-Fold Cross Validation
Steps:
1. Split data into K parts.
2. Train on K−1 parts.
3. Test on remaining part.
4. Repeat K times.
5. Calculate average accuracy.
Benefits:
Better evaluation
Helps detect overfitting
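The K-Fold steps above can be sketched in plain Python. The labels and the "always predict the most common training label" model are assumptions made only to keep the example short; in practice the data is usually shuffled before splitting and a real model is trained on each fold.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# Made-up labels for six students.
labels = ["pass", "pass", "pass", "fail", "pass", "pass"]

def baseline_accuracy(train, test):
    """Train: find the most common training label. Test: accuracy of always predicting it."""
    train_labels = [labels[i] for i in train]
    prediction = max(sorted(set(train_labels)), key=train_labels.count)
    return sum(labels[i] == prediction for i in test) / len(test)

# Steps 1-4: split, train on K-1 parts, test on the remaining part, repeat.
scores = [baseline_accuracy(train, test) for train, test in k_fold_indices(len(labels), 3)]
# Step 5: average the K scores.
average_accuracy = sum(scores) / len(scores)
print(scores, average_accuracy)
```

Every sample is used for testing exactly once, which is why the averaged score is a steadier estimate than a single train/test split.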
---
Linear Regression
One of the simplest ML algorithms.
Used for:
Predicting continuous values
Example:
House price prediction
Salary prediction
The model is represented by:
y = mx + b
Where:
y = predicted value
m = slope
b = intercept
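One common way to find m and b is the least-squares method, which picks the line with the smallest total squared error. The sketch below assumes a single input feature and uses made-up study-hours data; the function name fit_line is only illustrative.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = m*x + b for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by variance of x.
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    # Intercept: the fitted line passes through the point of means.
    b = mean_y - m * mean_x
    return m, b

# Made-up data: study hours and marks.
hours = [2, 4, 6, 8]
marks = [40, 50, 60, 70]

m, b = fit_line(hours, marks)
print(m, b)  # 5.0 30.0

# Predict marks for a student who studies 5 hours.
predicted = m * 5 + b
```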
---
Decision Trees
A tree-like model used for classification and regression. Each internal node tests a feature, and each leaf gives a prediction.
Example:
Study?
├── Yes → Pass
└── No → Fail
Advantages:
Easy to understand
Easy to visualize
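The Study? tree above can be written as a small data structure. This sketch assumes the tree is already built by hand; a real decision tree algorithm would learn the questions from data. The dictionary layout is just one possible representation.

```python
# Internal nodes are dicts that name a feature and list a branch per answer.
# Leaves are plain strings holding the prediction.
tree = {
    "feature": "study",
    "branches": {"yes": "Pass", "no": "Fail"},
}

def predict(node, sample):
    """Follow the branches that match the sample until a leaf is reached."""
    while isinstance(node, dict):
        answer = sample[node["feature"]]
        node = node["branches"][answer]
    return node

print(predict(tree, {"study": "yes"}))  # Pass
print(predict(tree, {"study": "no"}))   # Fail
```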
---
Overfitting
Occurs when a model memorizes training data instead of learning patterns.
Symptoms
High training accuracy
Poor test accuracy
Example: Student memorizes answers but cannot solve new questions.
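The symptoms can be reproduced with a toy example. Both "models" and all data values below are hypothetical: the memorizer simply stores the training answers, while the threshold rule uses a hand-picked cut-off of 5 study hours.

```python
# (study_hours, result) pairs. Made-up values for illustration.
train_data = [(1, "fail"), (2, "fail"), (7, "pass"), (9, "pass")]
test_data = [(3, "fail"), (8, "pass"), (6, "pass"), (0, "fail")]

# Overfit model: a lookup table of the training answers.
memory = dict(train_data)

def memorizer(hours):
    # Unseen inputs get a fixed default guess.
    return memory.get(hours, "fail")

# Simpler model that captures the pattern instead of the individual answers.
def threshold_rule(hours):
    return "pass" if hours >= 5 else "fail"

def accuracy(model, data):
    """Fraction of examples the model predicts correctly."""
    return sum(model(x) == y for x, y in data) / len(data)

print(accuracy(memorizer, train_data), accuracy(memorizer, test_data))            # 1.0 0.5
print(accuracy(threshold_rule, train_data), accuracy(threshold_rule, test_data))  # 1.0 1.0
```

The gap between training accuracy and test accuracy is the signal to watch for.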
---
Learning System Design
Steps:
1. Collect Data
2. Preprocess Data
3. Select Features
4. Train Model
5. Evaluate Model
6. Deploy Model
---
Perspectives and Issues in ML
Common challenges:
Data Quality
Bad data → Bad predictions
Overfitting
Model learns noise
Underfitting
Model is too simple
Computational Cost
Large datasets need more resources
---
Ensemble Learning
Combines multiple models to improve performance.
Idea:
Many Weak Models
↓
Combined
↓
Strong Model
Examples:
Random Forest
Boosting
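Majority voting is one simple way to combine models. The sketch below is not Random Forest or Boosting itself; it only shows the combining idea, using three hypothetical threshold rules as the weak models.

```python
from collections import Counter

# Three weak models that disagree on where the pass/fail boundary lies.
def model_a(hours):
    return "pass" if hours >= 3 else "fail"

def model_b(hours):
    return "pass" if hours >= 5 else "fail"

def model_c(hours):
    return "pass" if hours >= 7 else "fail"

def majority_vote(models, sample):
    """Return the prediction that most models agree on."""
    votes = [model(sample) for model in models]
    return Counter(votes).most_common(1)[0][0]

models = [model_a, model_b, model_c]
print(majority_vote(models, 6))  # pass (two of three models vote pass)
print(majority_vote(models, 4))  # fail (two of three models vote fail)
```

Using an odd number of models with two classes avoids tied votes.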
---
Applications of Machine Learning
Healthcare
Disease prediction
Banking
Fraud detection
Education
Student performance prediction
E-commerce
Product recommendations
Agriculture
Crop prediction
---
Feature Engineering
Process of creating, selecting, and transforming features so that a model can learn from them more easily.
Example:
Original Data:
Date: 13-06-2026
Feature Engineering:
Day = Saturday
Month = June
Year = 2026
Benefits:
Can improve accuracy
Can reduce model complexity
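The date example above can be sketched with Python's standard datetime module. The function name date_features and the dictionary keys are illustrative choices, and the date format is assumed to be DD-MM-YYYY as in the example.

```python
from datetime import datetime

# Fixed English names so the output does not depend on the system locale.
DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

def date_features(text):
    """Turn a DD-MM-YYYY string into separate day, month, and year features."""
    parsed = datetime.strptime(text, "%d-%m-%Y")
    return {
        "day": DAY_NAMES[parsed.weekday()],   # weekday(): Monday is 0
        "month": MONTH_NAMES[parsed.month - 1],
        "year": parsed.year,
    }

features = date_features("13-06-2026")
print(features)  # {'day': 'Saturday', 'month': 'June', 'year': 2026}
```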
---
Important Exam Questions
Short Questions
1. Define Machine Learning.
2. What is Supervised Learning?
3. What is Unsupervised Learning?
4. Define Reinforcement Learning.
5. What is Cross Validation?
6. What is Overfitting?
7. What is Feature Engineering?
8. Define Hypothesis Space.
---
Long Questions
1. Explain different types of Machine Learning.
2. Discuss Cross Validation with examples.
3. Explain Linear Regression.
4. Explain Decision Trees.
5. What is Overfitting? How can it be reduced?
6. Explain the design of a learning system.
---
Quick Revision
ML = Learning from data.
Supervised = Labeled data.
Unsupervised = Unlabeled data.
Reinforcement = Reward/Penalty.
Linear Regression = Predicts continuous values with a straight line.
Decision Tree = Tree-based model.
Overfitting = Memorizing training data.
Cross Validation = Reliable testing.
Feature Engineering = Creating useful features.
Next: Unit 2
Evaluation Metrics (Precision, Recall, F1, MSE), K-Means Clustering, Bayes Learning, Gaussian Mixture Models, Feature Reduction. This unit is very important for both exams and ML interviews.