# Python Pandas Percentile

In Python Pandas, there are some inbuilt functions such as `mean()`, `mode()`, `median()`, and `quantile()` function.

Developers consider this function a crucial step in the data cleaning process.

## Calculate Pandas Percentile in Python

A luxurious car company wants to develop its branches, for that it is trying to locate the best location thru getting the people’s income.

``````import pandas as pd
import numpy as np

df
``````

Output:

``````# A Dataframe with `Name` and `Monthly Income (in INR)` as columns present in the Record.xlsx file. These columns determine the income of people present in the respective locality.
index	Name		Monthly Income (in INR)
0	John		30000
1	Joel		35000
2	Albert		45000
3	Ali		50000
4	Jerin		70000
5	Robert		90000
6	Sampath		1000000
``````

We have imported the dataset into a dataframe using pandas. Let us filter the data frame with our requirements.

The company needs the average income of that locality, affecting its growth.

``````df.describe()
``````

Output:

``````index	Monthly Income (in INR)
count	7.000000
mean	188571.428571
std	358407.522774
min	30000.000000
25%	40000.000000
50%	50000.000000
75%	80000.000000
max	1000000.000000
``````

We can infer that the mean is higher than the average income of the people. We use the `percentile` concept to deal with these types of real-time situations.

``````percentile = df["Monthly Income (in INR)"].quantile(0.99)
percentile
``````

Output:

``````945399.9999999995
``````
``````new_df = df[df["Monthly Income (in INR)"] <= percentile]
new_df
``````

Output:

``````index	Name		Monthly Income (in INR)
0	John		30000
1	Joel		35000
2	Albert		45000
3	Ali		50000
4	Jerin		70000
5	Robert		90000
``````

We have found one solution, and now we have to fill the null values in a dataframe, mostly average values.

``````df
``````

Output:

``````index	Name		Monthly Income (in INR)
0	John		30000.0
1	Joel		35000.0
2	Albert		45000.0
3	Ali		NaN
4	Jerin		70000.0
5	Robert		90000.0
6	Sampath		1000000.0
``````
``````avg = df["Monthly Income (in INR)"].mean()
df.fillna(avg)
``````

Output:

``````index	Name		Monthly Income (in INR)
0	John		30000.000000
1	Joel		35000.000000
2	Albert		45000.000000
3	Ali		211666.666667
4	Jerin		70000.000000
5	Robert		90000.000000
6	Sampath		1000000.000000
``````

We can see that the null value is filled with the `average incomes`, but it is not ideal. To overcome this problem, we use another method.

``````med = df["Monthly Income (in INR)"].median()
df.fillna(med)
``````

Output:

``````index	Name		Monthly Income (in INR)
0	John		30000.0
1	Joel		35000.0
2	Albert		45000.0
3	Ali		57500.0
4	Jerin		70000.0
5	Robert		90000.0
6	Sampath		1000000.0
``````

In this way, we can determine the proper value for the company based on its growth.