Lesson 6.3: Bias vs Variance – Underfitting & Overfitting
🔹 Key Concepts
- Bias
  - Error due to oversimplified model assumptions.
  - High bias → Model cannot capture data patterns → Underfitting.
  - Example: Predicting house prices using only the number of bedrooms, ignoring location, size, and age.
- Variance
  - Error due to the model being too sensitive to the training data.
  - High variance → Model captures noise → Overfitting.
  - Example: Model predicts the training data perfectly but performs poorly on new data.
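A minimal NumPy sketch of both failure modes (the data, seed, and degrees are illustrative choices, not from the lesson): fitting polynomials of different degrees to noisy samples of a cubic shows high bias (degree 1, large error everywhere) versus high variance (degree 12, tiny training error, worse test error).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: a cubic trend plus noise (illustrative, not real data)
x = np.linspace(-3, 3, 60)
y = x**3 - 2 * x + rng.normal(scale=3.0, size=x.size)

# Hold out every third point as a test set
test = np.arange(x.size) % 3 == 0
x_tr, y_tr = x[~test], y[~test]
x_te, y_te = x[test], y[test]

def mse(degree):
    """Fit a polynomial of the given degree on the training split;
    return (train_error, test_error)."""
    coeffs = np.polyfit(x_tr, y_tr, degree)
    err = lambda xs, ys: np.mean((np.polyval(coeffs, xs) - ys) ** 2)
    return err(x_tr, y_tr), err(x_te, y_te)

# Degree 1:  high bias  -> large error on BOTH splits (underfitting)
# Degree 12: high variance -> very low train error, higher test error (overfitting)
for d in (1, 3, 12):
    train_err, test_err = mse(d)
    print(f"degree={d:2d}  train={train_err:7.2f}  test={test_err:7.2f}")
```

The degree-3 fit sits between the two extremes: it matches the true trend without chasing the noise.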
🔹 Underfitting
- Occurs when the model is too simple.
- Cannot learn the underlying patterns.
- Signs: Low accuracy on both training and test data.
- Solution: Use a more complex model or add features.
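The house-price example above can be sketched with NumPy least squares (the features, coefficients, and noise level are hypothetical, chosen only to illustrate the point): a model using bedrooms alone underfits, and adding the ignored features sharply reduces the error.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200

# Hypothetical housing data (coefficients are illustrative, not real figures)
bedrooms = rng.integers(1, 6, size=n).astype(float)
size_m2 = rng.uniform(40, 200, size=n)
age_yrs = rng.uniform(0, 50, size=n)
price = 30 * bedrooms + 2.5 * size_m2 - 1.2 * age_yrs + rng.normal(scale=10, size=n)

def fit_mse(X, y):
    """Least-squares fit with an intercept; return the mean squared error."""
    A = np.column_stack([X, np.ones(len(y))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.mean((A @ w - y) ** 2)

mse_simple = fit_mse(bedrooms[:, None], price)  # bedrooms only: underfits
mse_full = fit_mse(np.column_stack([bedrooms, size_m2, age_yrs]), price)
print(f"bedrooms only: {mse_simple:8.1f}")
print(f"all features:  {mse_full:8.1f}")
```

The simple model is biased: no choice of slope over bedrooms alone can explain variation driven by size and age.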
🔹 Overfitting
- Occurs when the model is too complex.
- Learns noise in the training data.
- Signs: High accuracy on training data, low on test data.
- Solutions:
  - Regularization (L1, L2)
  - Pruning trees
  - More training data
  - Cross-validation
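The first remedy, L2 regularization, can be sketched in closed form (the data and penalty strengths are assumed for illustration): adding a penalty λ‖w‖² to least squares shrinks the coefficients of an over-flexible polynomial fit.

```python
import numpy as np

rng = np.random.default_rng(2)

# Noisy samples of a smooth curve (illustrative, not from the lesson)
x = np.linspace(0, 1, 30)
y = np.sin(2 * np.pi * x) + rng.normal(scale=0.3, size=x.size)

# Degree-12 polynomial features: flexible enough to overfit 30 points
X = np.vander(x, 13)

def ridge_fit(X, y, lam):
    """Closed-form L2 (ridge) solution: w = (X^T X + lam * I)^-1 X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

w_weak = ridge_fit(X, y, lam=1e-6)  # nearly unregularized: large, wiggly coefficients
w_reg = ridge_fit(X, y, lam=0.1)    # L2 penalty shrinks the coefficients

print("coefficient norm, weak penalty:  ", round(float(np.linalg.norm(w_weak)), 1))
print("coefficient norm, strong penalty:", round(float(np.linalg.norm(w_reg)), 1))
```

Smaller coefficients mean a smoother curve that tracks the trend rather than the noise; L1 works similarly but drives some coefficients exactly to zero.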
🔹 Bias-Variance Tradeoff
- Goal → Balance bias and variance to minimize total error.
- Optimal model → Neither underfits nor overfits.
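Finding that balance can be sketched as a model-complexity sweep on held-out data (the quadratic ground truth, split sizes, and degree range are assumptions for the demo): validation error is high for degrees that underfit, drops near the true complexity, and the degree minimizing it is the sweet spot.

```python
import numpy as np

rng = np.random.default_rng(3)

# Quadratic ground truth plus noise (illustrative, not real data)
x = rng.uniform(-2, 2, size=80)
y = 1.5 * x**2 - x + rng.normal(scale=0.5, size=x.size)

# Simple split: first 60 points train, last 20 validate
x_tr, y_tr = x[:60], y[:60]
x_va, y_va = x[60:], y[60:]

def val_error(degree):
    """Validation MSE of a polynomial fit of the given degree."""
    coeffs = np.polyfit(x_tr, y_tr, degree)
    return float(np.mean((np.polyval(coeffs, x_va) - y_va) ** 2))

errors = {d: val_error(d) for d in range(1, 9)}
best = min(errors, key=errors.get)
print("validation error by degree:", {d: round(e, 3) for d, e in errors.items()})
print("sweet spot degree:", best)
```

K-fold cross-validation refines this idea by averaging the held-out error over several splits instead of relying on one.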
✅ Quick Recap:
- High Bias → Underfitting → Model too simple
- High Variance → Overfitting → Model too complex
- Tradeoff → Find the sweet spot for best performance
