Lesson 11.3: Project 2 – Titanic Survival Prediction (Classification)
🔹 Objective
- Predict whether a passenger survived the Titanic disaster using features like age, sex, class, and fare.
- Practice data cleaning, feature engineering, classification modeling, and evaluation.
🔹 Steps to Build the Project
- Load Dataset
- Understand Dataset
  - Check columns, missing values, and data types.
- Preprocess Data
  - Handle missing values → Fill missing Age values (for example, with the median); drop Cabin if most of its values are missing.
  - Encode categorical variables → Apply One-Hot Encoding to Sex and Embarked.
  - Feature scaling → Not needed for tree-based models; it helps scale-sensitive models such as Logistic Regression and SVM.
- Split Dataset into training and test sets
- Build Classification Model
  - Example: Logistic Regression
- Evaluate Model
  - Metrics: Accuracy, Precision, Recall, F1-score, Confusion Matrix
- Optional Improvements
  - Try Random Forest, XGBoost, or SVM and compare them against the Logistic Regression baseline.
  - Feature engineering: create FamilySize and extract Title from passenger names.
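The steps above can be sketched end to end as follows. This is a minimal illustration, not a reference solution: it assumes pandas and scikit-learn are installed, it uses Kaggle-style column names (Pclass, Sex, Age, Fare, Embarked, Cabin, Survived), and it runs on synthetic stand-in data generated in the script so that it works without a dataset file. The file name in the comment and the survival rule are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# 1. Load: synthetic stand-in data (assumption), using Kaggle-style column names.
#    With a real file this step might be: df = pd.read_csv("train.csv")
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "Pclass": rng.integers(1, 4, n),            # 1, 2, or 3
    "Sex": rng.choice(["male", "female"], n),
    "Age": rng.uniform(1, 70, n).round(),
    "Fare": rng.uniform(5, 100, n).round(2),
    "Embarked": rng.choice(["S", "C", "Q"], n),
    "Cabin": "C85",
})
df.loc[rng.random(n) < 0.2, "Age"] = np.nan     # some ages missing
df.loc[rng.random(n) < 0.8, "Cabin"] = np.nan   # most cabins missing
# Made-up survival rule, only so the model has a signal to learn.
p_survive = 0.15 + 0.55 * (df["Sex"] == "female") + 0.1 * (3 - df["Pclass"])
df["Survived"] = (rng.random(n) < p_survive.to_numpy()).astype(int)

# 2. Understand: data types and missing values per column.
print(df.dtypes)
print(df.isna().sum())

# 3. Preprocess: fill Age with the median, drop Cabin if mostly missing.
df["Age"] = df["Age"].fillna(df["Age"].median())
if df["Cabin"].isna().mean() > 0.5:
    df = df.drop(columns="Cabin")

# One-hot encode Sex and Embarked; drop_first removes one redundant column each.
X = pd.get_dummies(
    df.drop(columns="Survived"),
    columns=["Sex", "Embarked"],
    drop_first=True,
    dtype=int,
)
y = df["Survived"]

# 4. Split: hold out 20% for testing, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# 5. Build: Logistic Regression (max_iter raised because features are unscaled).
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# 6. Evaluate on the held-out test set.
y_pred = model.predict(X_test)
acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
print("Accuracy:", round(acc, 3))
print(cm)
print(classification_report(y_test, y_pred))
```

For brevity the median is computed on the full table before splitting; fitting the imputation on the training split only would avoid leaking test information into preprocessing.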
🔹 Key Learnings
- Classification predicts categorical outcomes (here: survived or did not survive).
- Feature engineering can improve model accuracy.
- Multiple evaluation metrics give a fuller picture of model performance than accuracy alone.
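To see why several metrics are worth reporting, the sketch below computes them by hand from a small set of labels. The helper name `classification_metrics` and the ten labels are hypothetical and chosen only for illustration; in practice scikit-learn's metric functions would normally be used instead.

```python
def classification_metrics(y_true, y_pred):
    """Hypothetical helper: binary metrics from two equal-length 0/1 lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        # Rows are true classes, columns are predicted classes.
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative labels (1 = survived, 0 = did not survive); not real results.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(name, value)
```

On these labels the accuracy is 0.7, yet recall is only about 0.33: the model finds just one of the three survivors, which accuracy alone would hide.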
✅ Quick Recap:
- Task → Predict Titanic survival (classification).
- Steps → Load → Clean → Encode → Split → Train → Evaluate.
- Improve → Try advanced classifiers, feature engineering.
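As a starting point for the feature-engineering improvement, the FamilySize and Title features could be derived roughly as shown here. This sketch assumes pandas, Kaggle-style columns (Name, SibSp for siblings/spouses aboard, Parch for parents/children aboard), and names written as "Surname, Title. Given names". The four rows are made up for illustration, and defining FamilySize as SibSp + Parch + 1 is one common convention rather than the only option.

```python
import pandas as pd

# Made-up rows in the assumed "Surname, Title. Given names" format.
df = pd.DataFrame({
    "Name": [
        "Smith, Mr. John",
        "Doe, Mrs. Jane (Mary Roe)",
        "Brown, Miss. Anna",
        "Lee, Master. Tom",
    ],
    "SibSp": [1, 1, 0, 3],
    "Parch": [0, 0, 0, 2],
})

# FamilySize: relatives aboard plus the passenger themselves.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# Title: the text between the comma and the first period in the name.
df["Title"] = (
    df["Name"].str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
)

print(df[["Name", "FamilySize", "Title"]])
```

Both new columns can then go through the same encoding step as the other features (Title is categorical, so it would need One-Hot Encoding) before training a model.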
