Lesson 7.10: Naive Bayes
🔹 What is Naive Bayes?
Naive Bayes is a probabilistic supervised learning algorithm based on Bayes’ Theorem.
- Used for classification problems.
- “Naive” → Assumes features are conditionally independent of each other, given the class.
Bayes’ Theorem:

P(A|B) = P(B|A) · P(A) / P(B)

- P(A|B) → Posterior: probability of class A given feature B
- P(B|A) → Likelihood: probability of feature B given class A
- P(A) → Prior probability of class A
- P(B) → Evidence: overall probability of feature B
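The formula above can be worked through with concrete numbers. This is a minimal sketch of a spam-detection calculation; all probabilities below are made-up illustrative values, not real data.

```python
# Worked example of Bayes' theorem for spam detection.
# All probabilities are made-up numbers for illustration.

p_spam = 0.3              # P(A): prior probability a message is spam
p_word_given_spam = 0.8   # P(B|A): probability the word "free" appears in spam
p_word_given_ham = 0.1    # P(B|not A): probability "free" appears in non-spam

# P(B): total probability of seeing "free" (law of total probability)
p_word = p_word_given_spam * p_spam + p_word_given_ham * (1 - p_spam)

# P(A|B): posterior probability the message is spam, given it contains "free"
p_spam_given_word = p_word_given_spam * p_spam / p_word

print(round(p_spam_given_word, 3))  # → 0.774
```

Even though only 30% of messages are spam (the prior), seeing the word “free” raises the spam probability to about 77% (the posterior).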
🔹 Types of Naive Bayes
- Gaussian NB → Assumes continuous features follow a normal distribution.
- Multinomial NB → Used for discrete count features (e.g., word counts in text classification).
- Bernoulli NB → Used for binary features (0 or 1), e.g., word presence/absence.
🔹 Example
Text classification (spam detection):
- X_train → Features (e.g., word counts)
- y_train → Class labels (spam or not spam)
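The spam-detection setup above can be sketched end-to-end with a tiny from-scratch Multinomial NB. The corpus below is invented for illustration; Laplace (add-one) smoothing is used so that words unseen in a class do not force its probability to zero.

```python
import math
from collections import Counter

# Toy corpus (made-up messages) and labels
X_train = [
    "win money now", "free money offer", "win a free prize",
    "meeting at noon", "lunch with team", "project meeting notes",
]
y_train = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Fit: per-class word counts and class frequencies
word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter(y_train)
for text, label in zip(X_train, y_train):
    word_counts[label].update(text.split())

vocab = {w for counts in word_counts.values() for w in counts}

def predict(text):
    scores = {}
    for label in class_counts:
        # Log-space avoids numerical underflow when multiplying many probabilities
        score = math.log(class_counts[label] / len(y_train))  # log prior
        total = sum(word_counts[label].values())
        for w in text.split():
            # Laplace smoothing: unseen words get count 0 + 1
            p = (word_counts[label][w] + 1) / (total + len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

print(predict("free money"))    # → spam
print(predict("team meeting"))  # → ham
```

This mirrors what scikit-learn’s `MultinomialNB` does internally after a count vectorizer turns raw text into word-count features.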
🔹 Advantages
- Simple, fast, and effective.
- Works well with high-dimensional data.
- Performs well in text classification.
🔹 Disadvantages
- Assumes feature independence, which rarely holds exactly in real data.
- Performs poorly when features are strongly correlated.
✅ Quick Recap:
- Naive Bayes → Probabilistic classifier using Bayes’ theorem.
- Assumes feature independence and works well for text or categorical data.
