Lesson 7.7: Decision Trees
🔹 What is a Decision Tree?
A Decision Tree is a supervised learning algorithm used for classification and regression.
- The model splits data into branches based on feature values, forming a tree-like structure.
- Each internal node → a feature decision
- Each leaf node → a predicted outcome
🔹 How it Works
1. Start with the root node (all training data).
2. Choose the best feature to split on using a criterion such as Gini impurity or entropy (classification), or MSE (regression).
3. Split the data into subsets.
4. Repeat on each subset until a stopping condition (e.g. max depth, minimum samples per node) is reached.
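The "best split" in step 2 can be sketched for the Gini criterion. This is a minimal pure-Python illustration, not library code; the function names `gini` and `split_impurity` are my own:

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a set of class labels: 1 - sum(p_k^2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(left, right):
    """Weighted impurity of a candidate split; the tree picks the split minimizing this."""
    n = len(left) + len(right)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

print(gini(["a", "a", "b", "b"]))          # 0.5 — maximally mixed two-class node
print(split_impurity(["a", "a"], ["b", "b"]))  # 0.0 — a perfect split
```

A pure node (all one class) has impurity 0, so the algorithm greedily prefers splits whose children are as pure as possible.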
🔹 Example (Classification)
- max_depth → limits tree depth to control overfitting
- X_train → feature data
- y_train → class labels
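The parameters above match scikit-learn's `DecisionTreeClassifier`; a minimal sketch assuming that library, with hypothetical toy data:

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy features: [height_cm, weight_kg]
X_train = [[150, 50], [160, 60], [170, 80], [180, 90]]
y_train = [0, 0, 1, 1]  # class labels

# max_depth caps how deep the tree can grow, controlling overfitting
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)

pred = clf.predict([[155, 55], [175, 85]])
print(pred)
```

On this tiny, cleanly separable dataset the tree needs only one split, so the two new points fall into the expected classes.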
🔹 Advantages
- Easy to visualize and interpret.
- Handles both numerical and categorical data.
- Captures non-linear relationships.
🔹 Disadvantages
- Prone to overfitting if the tree is too deep.
- Sensitive to small changes in the data (a small perturbation can produce a very different tree).
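The overfitting risk is easy to demonstrate: an unconstrained tree can memorize even pure noise. A sketch assuming scikit-learn, with randomly generated labels that carry no real signal:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = rng.integers(0, 2, size=100)  # random labels — nothing to learn

# No depth limit: the tree keeps splitting until every training point is fit
deep = DecisionTreeClassifier(random_state=0).fit(X, y)
print(deep.score(X, y))  # perfect training accuracy on noise = memorization

# Depth-limited: the tree cannot memorize the noise
shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(shallow.score(X, y))
```

Perfect training accuracy on random labels is a red flag, which is why `max_depth` (or `min_samples_leaf`) is routinely set in practice.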
✅ Quick Recap:
- Decision Tree → splits data into branches to make predictions.
- Good for interpretability, but watch for overfitting.
