Lesson 4.2: Descriptive Statistics – Mean, Median, Mode, Variance, Std Dev
What are Descriptive Statistics?
Descriptive statistics summarize and describe the main features of a dataset. They give a quick understanding of data without going deep into modeling.
Key Measures:
- Mean (Average)
  - Formula: $\text{Mean} = \frac{\text{Sum of all values}}{\text{Number of values}}$
  - Example: For [10, 20, 30] → Mean = (10 + 20 + 30) / 3 = 20.
- Median (Middle Value)
  - The middle value when the data is sorted.
  - Example: [10, 20, 30] → Median = 20.
  - For an even number of values, the median is the average of the two middle values, e.g. [10, 20, 30, 40] → Median = (20 + 30) / 2 = 25.
- Mode (Most Frequent Value)
  - The value that appears most often.
  - Example: [10, 20, 20, 30] → Mode = 20.
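- Variance
  - Measures how far data points are spread out from the mean: it is the average of the squared differences between each value and the mean.
  - Formula (population variance): $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$, where $\mu$ is the mean and $N$ is the number of values.
  - Example: For [10, 20, 30] → Variance = ((10 - 20)² + (20 - 20)² + (30 - 20)²) / 3 = 200 / 3 ≈ 66.67.
  - High variance = more spread; low variance = less spread.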
- Standard Deviation (SD)
  - The square root of the variance, so it is expressed in the same units as the data.
  - Tells how much values typically deviate from the mean.
  - Example: In exam scores, a low SD means students scored similarly; a high SD means scores varied widely.
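The five measures above can be computed with Python's built-in `statistics` module. The following is a minimal sketch, not a definitive implementation: the `scores` list is made-up illustration data, and the choice to show both population and sample variance is an assumption, since the lesson does not say which one it intends.

```python
import statistics

# Hypothetical data, chosen only for illustration.
scores = [10, 20, 20, 30]

mean = statistics.mean(scores)            # sum of values / number of values
median = statistics.median(scores)        # even count: average of the two middle values
mode = statistics.mode(scores)            # most frequent value
pop_var = statistics.pvariance(scores)    # population variance (divides by N)
pop_sd = statistics.pstdev(scores)        # square root of the population variance
sample_var = statistics.variance(scores)  # sample variance (divides by N - 1)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("population variance:", pop_var)
print("population SD:", round(pop_sd, 3))
print("sample variance:", round(sample_var, 3))
```

The population versions (`pvariance`, `pstdev`) treat the list as the complete dataset, while `variance` and `stdev` assume the list is a sample drawn from a larger population, so they give slightly larger values.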
Why Are They Important?
- They help in understanding how the data is distributed.
- They are used to detect outliers and measure variability.
- They provide a foundation for further statistical analysis and ML models.
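As one illustration of using these measures to detect outliers, the sketch below flags values that lie more than a chosen number of standard deviations from the mean (a z-score check). The data and the threshold of 2 are assumptions made for this example; the threshold is a common rule of thumb, not a fixed standard.

```python
import statistics

# Hypothetical scores; 95 is deliberately far from the rest.
scores = [50, 52, 48, 51, 49, 95]

mean = statistics.mean(scores)
median = statistics.median(scores)
sd = statistics.pstdev(scores)  # population standard deviation

# z-score: how many standard deviations a value lies from the mean.
z_scores = [(x - mean) / sd for x in scores]

# Assumed cut-off for this sketch; adjust to suit the data.
THRESHOLD = 2.0
outliers = [x for x, z in zip(scores, z_scores) if abs(z) > THRESHOLD]

print("mean:", mean, "median:", median)
print("outliers:", outliers)
```

In this data the single extreme value pulls the mean (57.5) well above the median (50.5), which is one reason to look at both measures together.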
