Lesson 5.1: Basics of Statistics – Population vs Sample
Statistics is the science of collecting, analyzing, and interpreting data.
In data science, understanding the difference between a population and a sample is essential.
🔹 Population
- A population is the entire set of items, people, or data you are studying.
- Example: Marks of all students in India.
- Usually too large to study completely.
🔹 Sample
- A sample is a subset of the population, selected to represent the whole.
- Example: Marks of 500 students from India taken for analysis.
- Used because it is more practical, faster, and cheaper than studying the full population.
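The marks example can be sketched in Python. This is only an illustration under stated assumptions: the population is simulated (100,000 random marks standing in for "all students"), and the names `population` and `sample` are hypothetical, not part of the lesson.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 simulated marks between 0 and 100.
# (A stand-in for "all students"; a real population would be far larger.)
population = [random.randint(0, 100) for _ in range(100_000)]

# Draw a simple random sample of 500 marks, without replacement.
sample = random.sample(population, 500)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"Population size N = {len(population)}, mean = {population_mean:.2f}")
print(f"Sample size     n = {len(sample)}, mean = {sample_mean:.2f}")
```

The two means should come out close to each other, even though the sample covers only 0.5% of the simulated population.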
🔹 Key Points
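- Population size is denoted N and is usually large.
- Sample size is denoted n and is much smaller than N.
- A good sample is chosen at random so that it is not biased toward any part of the population.
- Data analysis is usually done on samples, and the results are generalized to the population.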
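To see why randomness matters, the sketch below compares a random sample with a deliberately biased one. It assumes a simulated population of marks (bell-shaped around 60, clipped to the range 0 to 100); the biased sample takes only the top scorers, as if only one high-performing school had been surveyed. All names and numbers are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(0)  # seeded generator so the sketch is repeatable

# Hypothetical population: 10,000 simulated marks, clipped to the range 0..100.
population = [min(100.0, max(0.0, rng.gauss(60, 15))) for _ in range(10_000)]

n = 200

# Random sample: every member of the population is equally likely to be picked.
random_sample = rng.sample(population, n)

# Biased sample: only the n highest marks are picked.
biased_sample = sorted(population, reverse=True)[:n]

true_mean = statistics.mean(population)
random_error = abs(statistics.mean(random_sample) - true_mean)
biased_error = abs(statistics.mean(biased_sample) - true_mean)

print(f"Population mean:            {true_mean:.2f}")
print(f"Error of the random sample: {random_error:.2f}")
print(f"Error of the biased sample: {biased_error:.2f}")
```

Both samples have the same size, but only the random one gives a result that can reasonably be generalized to the population.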
✅ Example:
If a company wants to know the average salary of employees in India, asking every employee is impractical. Instead, it surveys a sample of employees and uses the sample average to estimate the average for the whole population.
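A rough sketch of the salary survey, assuming simulated data: the salaries are generated from a log-normal distribution chosen purely for illustration, and `survey_estimate` is a hypothetical helper, not an established function. Running the survey several times shows that each sample gives a slightly different estimate, all close to the true average.

```python
import random
import statistics

rng = random.Random(7)  # seeded generator so the sketch is repeatable

# Hypothetical population: 200,000 simulated annual salaries.
# The log-normal shape and its parameters are assumptions for illustration only.
population = [rng.lognormvariate(13.0, 0.5) for _ in range(200_000)]
true_average = statistics.mean(population)


def survey_estimate(salaries, n, rng):
    """Estimate the population average from one simple random sample of size n."""
    return statistics.mean(rng.sample(salaries, n))


# Run the same survey five times; each run draws a different sample.
estimates = [survey_estimate(population, 1_000, rng) for _ in range(5)]

print(f"True average (normally unknown): {true_average:,.0f}")
for i, estimate in enumerate(estimates, start=1):
    print(f"Survey {i}: estimated average = {estimate:,.0f}")
```

In a real survey the true average is not available for comparison; it is computed here only because the population is simulated.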
