What are Percentiles?
In statistics, percentiles help us understand how a data set is distributed. A percentile tells us the value below which a certain percentage of the data falls. For example, the 75th percentile tells us the value below which 75% of the data points are located.
Key Concept:
Percentiles divide a sorted data set into 100 equal parts. For example, the 25th percentile is the value that 25% of the data points fall below.
Example: Ages of People in a Barangay
Imagine we have the ages of people living on a street in a barangay (neighborhood) in the Philippines. We want to find the 75th percentile: the age that 75% of the people are at or below.
ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82, 32, 2, 8, 6, 25, 36, 27, 61, 31]
Step 1: Finding the 75th Percentile
The 75th percentile is the age that 75% of the people are at or below. In this case, the 75th percentile value is 43. This means that about 75% of the people in this group (16 of the 21) are 43 years old or younger.
To calculate the 75th percentile in Python, we use the NumPy module:
import numpy

ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82, 32, 2, 8, 6, 25, 36, 27, 61, 31]

# 75th percentile of the ages
x = numpy.percentile(ages, 75)
print(x)
The output will be 43.0, meaning that about 75% of the people are 43 years old or younger.
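To see where the 43 comes from, the calculation can be sketched by hand. This is a minimal sketch, assuming NumPy's default behavior of linear interpolation between the two nearest sorted values; the helper name percentile_linear is hypothetical and used only for illustration.

```python
import math

def percentile_linear(values, p):
    """Return the p-th percentile (0-100) by linear interpolation (illustrative helper)."""
    ordered = sorted(values)
    # Fractional position of the percentile in the sorted list (0-based).
    position = (p / 100) * (len(ordered) - 1)
    lower = math.floor(position)
    upper = math.ceil(position)
    # Blend the two neighboring values according to the fractional part.
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82, 32, 2, 8, 6, 25, 36, 27, 61, 31]
print(percentile_linear(ages, 75))
```

With 21 ages, the position is 0.75 × 20 = 15, so the result is the 16th smallest age, which is 43. No blending is needed here because the position lands exactly on a data point.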
Step 2: Finding the 90th Percentile
Let’s say we want to find the 90th percentile, the age that 90% of the people are at or below. In this case, the 90th percentile value is 61, meaning that about 90% of the people are 61 years old or younger.
To calculate the 90th percentile in Python, we again use the NumPy module:
import numpy

ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82, 32, 2, 8, 6, 25, 36, 27, 61, 31]

# 90th percentile of the ages
x = numpy.percentile(ages, 90)
print(x)
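Several percentiles can also be computed in one call, and the result can be checked against the definition. The sketch below passes a list of percentiles to numpy.percentile and then counts the share of people at or below the 90th percentile value; the variable names p25, p50, p75, p90, and share are illustrative choices.

```python
import numpy

ages = [5, 31, 43, 48, 50, 41, 7, 11, 15, 39, 80, 82, 32, 2, 8, 6, 25, 36, 27, 61, 31]

# Passing a list of percentiles returns one value per percentile.
p25, p50, p75, p90 = numpy.percentile(ages, [25, 50, 75, 90])
print(p25, p50, p75, p90)

# Share of people whose age is at or below the 90th percentile value.
share = sum(age <= p90 for age in ages) / len(ages)
print(round(share, 3))
```

For this data set, the four values are 11, 31, 43, and 61, and 19 of the 21 people (roughly 90%) are 61 or younger. The share is not exactly 90% because 21 people cannot be split into exact tenths.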
These calculations are helpful in understanding the distribution of data. For example, you can use percentiles to understand income levels in different areas of the Philippines, or to analyze how fast a jeepney or bus completes a route compared to other vehicles. By calculating percentiles, we can identify trends and make data-driven decisions.