Lesson 9.5: Ensemble Learning – Bagging, Boosting (AdaBoost, XGBoost, LightGBM)
🔹 What is Ensemble Learning?
Ensemble Learning combines multiple models to create a stronger, more accurate model.
- Reduces overfitting and improves generalization.
- Two main types: Bagging and Boosting.
🔹 Bagging (Bootstrap Aggregating)
- Builds multiple models in parallel, each trained on a random bootstrap sample of the data.
- Predictions are aggregated (majority vote for classification, average for regression).
- Example: Random Forest
Advantages:
- Reduces variance and prevents overfitting.
Disadvantages:
- Models are trained independently, so they cannot correct each other's errors or biases.
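A minimal bagging sketch, using scikit-learn's RandomForestClassifier and the built-in iris dataset (the dataset and hyperparameters here are illustrative assumptions, not part of the lesson):

```python
# Bagging example: Random Forest = an ensemble of decision trees,
# each trained on a bootstrap sample, predictions majority-voted.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load a small built-in dataset and split it
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# 100 trees trained in parallel on random subsets of the data
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)

print("Test accuracy:", forest.score(X_test, y_test))
```

Because each tree sees a different bootstrap sample, averaging their votes lowers variance compared with a single deep decision tree.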
🔹 Boosting
- Builds models sequentially, where each new model corrects the errors of the previous one.
- Examples: AdaBoost, XGBoost, LightGBM
- AdaBoost – assigns higher weights to misclassified points so the next learner focuses on them.
- XGBoost – optimized, regularized gradient boosting algorithm → fast and accurate.
- LightGBM – gradient boosting designed for large datasets; faster and more memory efficient.
Advantages:
- Primarily reduces bias; can also reduce variance.
- Often produces state-of-the-art performance on tabular data.
Disadvantages:
- Computationally more expensive than bagging, since models must be trained sequentially.
- Sensitive to noisy data and outliers.
🔹 Example (Using AdaBoost)
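A minimal AdaBoost sketch with scikit-learn's AdaBoostClassifier on a synthetic dataset (the dataset and hyperparameters are illustrative assumptions):

```python
# Boosting example: AdaBoost trains weak learners sequentially,
# reweighting misclassified points so later learners focus on them.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 100 weak learners (decision stumps by default), fit one after another
model = AdaBoostClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
```

XGBoost and LightGBM expose a very similar fit/predict interface through their own packages (`xgboost`, `lightgbm`), with extra parameters for regularization and speed.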
✅ Quick Recap:
- Ensemble Learning → combines multiple models for better performance.
- Bagging → reduces variance; models trained in parallel.
- Boosting → reduces bias; models trained sequentially.
- Popular boosting algorithms → AdaBoost, XGBoost, LightGBM
