Pandas DataFrame DataFrame.dropna() Function

Minahil Noor Jan 30, 2023

Syntax of pandas.DataFrame.dropna()
Example Codes: DataFrame.dropna() to Drop Row
Example Codes: DataFrame.dropna() to Drop Column
Example Codes: DataFrame.dropna() With how=all
Example Codes: DataFrame.dropna() With a Specified Subset or Thresh
Example Codes: DataFrame.dropna() With inplace=True

The pandas.DataFrame.dropna() function removes null (missing) values from a DataFrame by dropping the rows or columns that contain them.

pandas treats None, NaN (Not a Number), and NaT (Not a Time) as null values. DataFrame.dropna() detects these values and filters the DataFrame accordingly.
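As a minimal sketch of that detection, the snippet below builds a small DataFrame in which each row after the first holds a different kind of null value. The column names and data are made up for illustration and are not part of the examples that follow.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: rows 1, 2 and 3 each contain one kind of null value.
df = pd.DataFrame(
    {
        "score": [1.0, np.nan, 3.0, 4.0],  # NaN in row 1
        "label": ["a", "b", None, "d"],  # None in row 2
        "seen": pd.to_datetime(
            ["2023-01-01", "2023-01-02", "2023-01-03", None]
        ),  # None becomes NaT in row 3
    }
)

# Count the null values that pandas detects in each column.
print(df.isna().sum())

# With default arguments, only the fully populated row 0 survives.
cleaned = df.dropna()
print(cleaned)
```

All three markers are reported by isna(), which is why a single dropna() call removes rows 1 to 3.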

Syntax of `pandas.DataFrame.dropna()`

DataFrame.dropna(axis, how, thresh, subset, inplace)

Parameters


`axis`	It selects whether rows or columns are dropped. If it is 0 or `'index'`, the function drops the rows containing missing values. If it is 1 or `'columns'`, it drops the columns containing missing values. By default, its value is 0.
`how`	This parameter determines when a row or column is dropped. It accepts only two strings, `'any'` or `'all'`. By default, it’s set to `'any'`. `'any'` drops the row or column if at least one value in it is null. `'all'` drops the row or column only if every value in it is missing.
`thresh`	It is an integer that specifies the minimum number of non-missing values a row or column must contain to be kept.
`subset`	It is an array-like of labels along the other axis that limits where the function looks for missing values. For example, when dropping rows, it lists the columns to check.
`inplace`	It is a Boolean value that modifies the caller `DataFrame` directly if set to `True`. By default, its value is `False`.
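The integer and string forms of `axis` described above are interchangeable. The following sketch checks this on a small made-up DataFrame; the column names `a` and `b` are illustrative only.

```python
import pandas as pd

# Hypothetical data: only row 2 and only column "b" contain a missing value.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, None]})

# Dropping rows: 0 and "index" select the same axis.
rows_int = df.dropna(axis=0)
rows_str = df.dropna(axis="index")

# Dropping columns: 1 and "columns" select the same axis.
cols_int = df.dropna(axis=1)
cols_str = df.dropna(axis="columns")

print(rows_int.equals(rows_str))
print(cols_int.equals(cols_str))
print(list(rows_int.index), list(cols_int.columns))
```

The string forms tend to read more clearly in code that other people will maintain.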

Return

It returns a filtered DataFrame with rows or columns dropped according to the passed parameters. If inplace is True, it returns None instead.

Example Codes: `DataFrame.dropna()` to Drop Row

By default, axis is 0 (rows), so the examples in this section drop rows.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)
print(dataframe)

The example DataFrame is as follows.

   Attendance    Name  Obtained Marks
0        60.0  Olivia             NaN
1         NaN    John            75.0
2        80.0   Laura            82.0
3         NaN     Ben            64.0
4        95.0   Kevin             NaN

All the parameters of this function are optional. If we pass no parameters, the function drops every row that contains at least one null value.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)
dataframe1 = dataframe.dropna()
print(dataframe1)

Output:

   Attendance   Name  Obtained Marks
2        80.0  Laura            82.0

It has dropped every row that contained at least one missing value, so only row 2 remains.

Example Codes: `DataFrame.dropna()` to Drop Column

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)
dataframe1 = dataframe.dropna(axis=1)

print(dataframe1)

Output:

     Name
0  Olivia
1    John
2   Laura
3     Ben
4   Kevin

It has dropped every column that contained at least one missing value because we set axis=1 in the DataFrame.dropna() method. Only the Name column has no missing values.

Example Codes: `DataFrame.dropna()` With `how=all`

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

dataframe1 = dataframe.dropna(axis=1, how="all")
print(dataframe1)

Output:

   Attendance    Name  Obtained Marks
0        60.0  Olivia             NaN
1         NaN    John            75.0
2        80.0   Laura            82.0
3         NaN     Ben            64.0
4        95.0   Kevin             NaN

No columns are dropped here. With axis=1 and how set to all, a column is dropped only when every value in it is null, and each column still has at least one non-missing value.

If every value in a column is missing, DataFrame.dropna() with how set to all drops that column.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: None, 2: None, 3: None, 4: None},
    }
)

print(dataframe)
print("--------")
dataframe1 = dataframe.dropna(axis=1, how="all")
print(dataframe1)

Output:

   Attendance    Name Obtained Marks
0        60.0  Olivia           None
1         NaN    John           None
2        80.0   Laura           None
3         NaN     Ben           None
4        95.0   Kevin           None
--------
   Attendance    Name
0        60.0  Olivia
1         NaN    John
2        80.0   Laura
3         NaN     Ben
4        95.0   Kevin
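The same rule applies along the default row axis. The sketch below uses a small made-up DataFrame (not the one from the examples above) in which row 1 is entirely missing and row 2 is only partly missing.

```python
import pandas as pd

# Hypothetical data: row 1 has no values at all, row 2 lacks only Attendance.
df = pd.DataFrame(
    {
        "Attendance": [60, None, None],
        "Obtained Marks": [70, None, 82],
    }
)

# how="all" removes only the row in which every value is missing.
kept_all = df.dropna(how="all")
print(kept_all)

# The default how="any" also removes the partly missing row.
kept_any = df.dropna()
print(kept_any)
```

Row 2 survives how="all" because it still holds one value, while the default drops it.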

Example Codes: `DataFrame.dropna()` With a Specified Subset or Thresh

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

dataframe1 = dataframe.dropna(thresh=3)
print(dataframe1)

Output:

   Attendance   Name  Obtained Marks
2        80.0  Laura            82.0

The value of thresh is 3, which means a row must have at least 3 non-missing values to be kept. Only row 2 has values in all three columns, so the other rows are dropped.
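To make the counting explicit, the following sketch reproduces the thresh=3 result by hand with notna(). It is an illustration of the rule rather than a description of how pandas implements it; the variable names counts and manual are ours.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": [60, None, 80, None, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        "Obtained Marks": [None, 75, 82, 64, None],
    }
)

# Count the non-missing values in each row.
counts = dataframe.notna().sum(axis=1)
print(counts.tolist())  # [2, 2, 3, 2, 2]

# Keeping the rows whose count reaches the threshold mirrors thresh=3.
manual = dataframe[counts >= 3]
print(manual.equals(dataframe.dropna(thresh=3)))  # True

# With a lower threshold, every row qualifies.
print(len(dataframe.dropna(thresh=2)))  # 5
```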

We could also specify the subset.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)

dataframe1 = dataframe.dropna(subset=["Attendance", "Name"])
print(dataframe1)

Output:

   Attendance    Name  Obtained Marks
0        60.0  Olivia             NaN
2        80.0   Laura            82.0
4        95.0   Kevin             NaN

It drops only the rows that have missing values in the Attendance or Name columns. Rows whose missing values are confined to other columns, Obtained Marks here, are kept.
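One way to think about subset is as a boolean mask built from the listed columns only. The sketch below, whose mask and manual variables are illustrative names, shows that this mask selects the same rows.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": [60, None, 80, None, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        "Obtained Marks": [None, 75, 82, 64, None],
    }
)

# True for rows where both checked columns hold a value.
mask = dataframe[["Attendance", "Name"]].notna().all(axis=1)
manual = dataframe[mask]

dropped = dataframe.dropna(subset=["Attendance", "Name"])
print(manual.equals(dropped))  # True
print(list(dropped.index))  # [0, 2, 4]
```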

Example Codes: `DataFrame.dropna()` With `inplace=True`

DataFrame.dropna() changes the caller DataFrame in-place if inplace is set to True.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": {0: 60, 1: None, 2: 80, 3: None, 4: 95},
        "Name": {0: "Olivia", 1: "John", 2: "Laura", 3: "Ben", 4: "Kevin"},
        "Obtained Marks": {0: None, 1: 75, 2: 82, 3: 64, 4: None},
    }
)
dataframe1 = dataframe.dropna(inplace=True)
print(dataframe1)

Output:

None

Because inplace is True, the method modified the caller DataFrame in place and returned None, so dataframe1 holds None rather than the filtered data.
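To see the filtered data after an in-place call, print the caller DataFrame itself. The sketch below also shows reassignment as an alternative; the names result and other are illustrative.

```python
import pandas as pd

dataframe = pd.DataFrame(
    {
        "Attendance": [60, None, 80, None, 95],
        "Name": ["Olivia", "John", "Laura", "Ben", "Kevin"],
        "Obtained Marks": [None, 75, 82, 64, None],
    }
)
other = dataframe.copy()

# In-place call: the return value is None, the caller is filtered.
result = dataframe.dropna(inplace=True)
print(result)
print(dataframe)

# Alternative: keep the default inplace=False and reassign the result.
other = other.dropna()
print(other.equals(dataframe))
```

Reassigning avoids accidentally storing None in a variable, which is an easy mistake to make with inplace=True.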
