Lesson 4.3: Data Visualization – Histogram, Scatter Plot, Box Plot, Heatmaps
What is Data Visualization?
Data visualization is the process of representing data in graphs and charts so that patterns, trends, and insights can be understood easily.
1. Histogram
- Shows the frequency distribution of numerical data.
- X-axis = ranges (bins), Y-axis = frequency.
- Example: A histogram of student scores shows how many students scored within each range (0–10, 10–20, etc.).
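The binning step behind a histogram can be sketched in a few lines of plain Python. This is a minimal illustration, not a full plotting routine: the scores are made-up data, and `histogram_counts`, `bin_width`, and `num_bins` are hypothetical names chosen for this example. It assumes non-negative values and places a value that sits exactly on the top edge into the last bin.

```python
# Hypothetical exam scores (made-up data for illustration).
scores = [5, 12, 18, 23, 27, 31, 35, 38, 42, 47, 55, 58]
bin_width = 10
num_bins = 6  # bins: 0-10, 10-20, ..., 50-60


def histogram_counts(values, bin_width, num_bins):
    """Count how many non-negative values fall into each equal-width bin."""
    counts = [0] * num_bins
    for value in values:
        # A value on a shared edge (e.g. 10) goes to the upper bin;
        # the very top edge is clamped into the last bin.
        index = min(int(value // bin_width), num_bins - 1)
        counts[index] += 1
    return counts


counts = histogram_counts(scores, bin_width, num_bins)

# Text rendering: one row per bin, one '#' per student.
for k, count in enumerate(counts):
    low, high = k * bin_width, (k + 1) * bin_width
    print(f"{low:>3}-{high:<3} | {'#' * count} ({count})")
```

If matplotlib is installed, `plt.hist(scores, bins=range(0, 70, 10))` should draw bars for the same bins.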
2. Scatter Plot
- Plots data points on X and Y axes to show the relationship between two variables.
- Example: Plotting “Study Hours” vs. “Exam Marks” shows whether more study hours are associated with higher marks.
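A scatter plot is just a set of (x, y) pairs, and the trend it reveals can be summarized with a correlation coefficient. The sketch below uses made-up study-hours and marks data; `pearson_r` is a hypothetical helper written for this example, and it assumes both lists have the same length and are not constant.

```python
from math import sqrt

# Hypothetical data (made up for illustration): one entry per student.
study_hours = [1, 2, 3, 4, 5, 6]
exam_marks = [35, 45, 50, 62, 70, 78]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Each (hours, marks) pair is one point on the scatter plot.
points = list(zip(study_hours, exam_marks))
r = pearson_r(study_hours, exam_marks)

print(points)
print(f"Pearson r = {r:.3f}")  # close to +1 means a strong upward trend
```

With matplotlib available, `plt.scatter(study_hours, exam_marks)` would draw these same points. A high correlation indicates an association between the two variables; on its own it does not prove that one causes the other.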
3. Box Plot (Box-and-Whisker Plot)
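- Summarizes the data distribution with five values: Minimum, Q1 (25th percentile), Median, Q3 (75th percentile), Maximum.
- Detects outliers, commonly flagged as points more than 1.5 × IQR (IQR = Q3 − Q1) below Q1 or above Q3.
- Example: Salaries in a company → box plot shows the median salary and the extreme high/low earners.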
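The numbers a box plot draws can be computed directly. The following sketch uses made-up salaries (in thousands) and a hypothetical helper, `box_plot_summary`. It assumes the common 1.5 × IQR rule for outliers and the "inclusive" quartile method of Python's `statistics.quantiles`; other tools may use slightly different quartile definitions.

```python
import statistics

# Hypothetical salaries in thousands (made-up data for illustration).
salaries = [30, 32, 35, 38, 40, 42, 45, 48, 50, 120]


def box_plot_summary(values):
    """Five-number summary plus outliers under the 1.5 * IQR rule."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers reach the most extreme points that are still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }


summary = box_plot_summary(salaries)
for name, value in summary.items():
    print(f"{name:>12}: {value}")
```

For this data the box spans Q1 = 35.75 to Q3 = 47.25 with the median at 41, and the salary of 120 is reported as an outlier. With matplotlib available, `plt.boxplot(salaries)` draws a comparable picture.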
4. Heatmap
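- A color-coded matrix in which each cell's color shows a value, such as the strength of the relationship between two variables.
- Example: Correlation heatmap between features (e.g., height, weight, and age).
- More intense colors = stronger relationship; the color scale (legend) shows which colors map to which values.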
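A correlation heatmap starts from a correlation matrix, which is then colored cell by cell. The sketch below builds that matrix for made-up height, weight, and age data and prints a rough text "heatmap" in which denser characters stand in for more intense colors. The data, the `pearson_r` and `shade` helpers, and the `SHADES` scale are all assumptions made for this illustration.

```python
from math import sqrt

# Hypothetical measurements for five people (made-up data for illustration).
features = {
    "height": [150, 160, 165, 170, 180],
    "weight": [50, 58, 63, 68, 80],
    "age": [34, 25, 41, 29, 38],
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


names = list(features)
# matrix[i][j] is the correlation between feature i and feature j.
matrix = [[pearson_r(features[a], features[b]) for b in names] for a in names]

SHADES = " .:*#"  # weakest to strongest relationship


def shade(r):
    """Map |r| in [0, 1] to a character; denser means stronger."""
    index = min(int(abs(r) * len(SHADES)), len(SHADES) - 1)
    return SHADES[index]


# Header row, then one row per feature: value followed by its shade.
print(" " * 8 + "".join(f"{name:>10}" for name in names))
for name, row in zip(names, matrix):
    cells = "".join(f"{r:>+8.2f} {shade(r)}" for r in row)
    print(f"{name:<8}{cells}")
```

With matplotlib available, `plt.imshow(matrix, cmap="coolwarm", vmin=-1, vmax=1)` followed by `plt.colorbar()` would render the same matrix with real colors and a legend.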
Why is Visualization Important?
- Helps in understanding data quickly.
- Detects patterns, trends, and outliers.
- Makes communication of findings easier.
