How to Check if a Cell Is Empty in Pandas

Olorunfemi Akinlua Feb 02, 2024
  1. Use the isnull() Function to Check if the Cell Is Empty
  2. Use the isna() Function to Check if the Cell Is Empty
  3. Use the any() Function to Check if the Cell Is Empty
  4. Use NumPy isnan to Check if the Cell Is Empty
  5. Conclusion

Empty data in a dataset can hinder meaningful analysis and operations, making it crucial to identify and handle such instances effectively.

When we load CSV data into Pandas as a DataFrame, each value occupies a cell, and any field left blank in the file is stored as a missing value (NaN). In this article, we will explore the methods to determine if a cell within a Pandas DataFrame is empty, equipping you with essential techniques to manage missing or null values in your data.
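As a small illustration of how a blank CSV field ends up as an empty cell, the sketch below reads a CSV from an in-memory string. The CSV content and the names csv_text and df_csv are made up for this example and are not part of the article's dataset.

```python
import io

import pandas as pd

# Hypothetical CSV content: the second data row leaves column B blank.
csv_text = "A,B,C\n1,2,3\n4,,6\n"

# read_csv stores the blank field as NaN with its default settings.
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv)
print(df_csv.isna())
```

The Boolean frame printed at the end marks only the blank field in column B as missing.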

Use the isnull() Function to Check if the Cell Is Empty

To showcase the functions in action, we will first create a Pandas dataframe with some empty cells.

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))
print(df)

Output:

   A  B    C    D
0  1  2  3.0  NaN
1  3  4  NaN  NaN
2  3  4  5.0  6.0

Now that we have a dataframe with 3 empty cells, we can use the isnull() function, which detects missing values (NaN, None, or NaT) in a scalar or an array-like object. For a scalar, it returns a single Boolean value; for an array-like object, it returns Boolean values of the same shape, each indicating whether the corresponding value is missing.
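To make the list of recognized missing values concrete, here is a minimal sketch that calls pd.isnull() on individual scalars. The dictionary name checks is only for this illustration. Note that an empty string is an ordinary value as far as Pandas is concerned, so it is not reported as missing.

```python
import numpy as np
import pandas as pd

# Map a label to the result of pd.isnull() on a single scalar value.
checks = {
    "NaN": pd.isnull(np.nan),        # True
    "None": pd.isnull(None),         # True
    "NaT": pd.isnull(pd.NaT),        # True
    "zero": pd.isnull(0),            # False: 0 is a real value
    "empty string": pd.isnull(""),   # False: "" is not a missing value
}

for label, result in checks.items():
    print(f"{label}: {result}")
```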

You can apply the isnull() function to the entire dataframe, a specific column, or a specific cell.

To check across the dataframe or a specific column, we call isnull() as a method on the dataframe or on the selected column.

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

empty_cells_df = df.isnull()
print("Empty cells in the DataFrame:\n", empty_cells_df)

empty_cells_C = df["C"].isnull()
print("\nEmpty cells in column 'C':\n", empty_cells_C)

Output:

Empty cells in the DataFrame:
       A      B      C      D
0  False  False  False   True
1  False  False   True   True
2  False  False  False  False

Empty cells in column 'C':
0    False
1     True
2    False
Name: C, dtype: bool

The output shows a dataframe and a column with Boolean values that indicate whether there is an empty value or not, where False means a non-empty value and True means an empty value.

If you only need to check whether a specific cell is empty, you can select it with either the loc or iloc indexer and pass the selected value to pd.isnull().

Here, we want to check the cell at index 1 in column C. The following code selects the cell in both ways and runs the check.

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

row_index = 1
column_name = "C"
specific_cell_loc = df.loc[row_index, column_name]
specific_cell_iloc = df[column_name].iloc[row_index]

is_empty_loc = pd.isnull(specific_cell_loc)
is_empty_iloc = pd.isnull(specific_cell_iloc)

print(f"Is the cell at row {row_index}, column '{column_name}' empty? {is_empty_loc}")
print(f"Is the cell at row {row_index}, column '{column_name}' empty? {is_empty_iloc}")

Output:

Is the cell at row 1, column 'C' empty? True
Is the cell at row 1, column 'C' empty? True

In this example, we accessed a specific cell in the DataFrame using df.loc[row_index, column_name] and df[column_name].iloc[row_index]. Then, we used pd.isnull() to check if the cell is empty.

The result is a Boolean value (True if the cell is empty, False otherwise), which we print to the console.
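For single-cell lookups, Pandas also provides the at and iat accessors, which are the scalar counterparts of loc and iloc. The sketch below is an alternative to the code above, not a replacement for it; the variable names are chosen for this example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

# at takes a row label and a column label; iat takes integer positions.
cell_by_label = df.at[1, "C"]
cell_by_position = df.iat[1, 2]  # column C is at position 2

is_empty_at = pd.isnull(cell_by_label)
is_empty_iat = pd.isnull(cell_by_position)

print(is_empty_at, is_empty_iat)  # True True
```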

Use the isna() Function to Check if the Cell Is Empty

Like isnull(), the isna() function in Pandas identifies missing or null values within a DataFrame. There is no functional difference between the two: isnull() is an alias for isna(), so they are the same function under different names.

You can choose to use either based on your preference. To check if a cell in a Pandas DataFrame is empty, we can use isna() in different ways.
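As a quick sanity check of that equivalence, the following sketch compares the results of both functions on the sample dataframe. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

# Both methods should produce identical Boolean DataFrames.
same_for_frame = df.isna().equals(df.isnull())

# The top-level functions should agree on a single cell as well.
same_for_cell = pd.isna(df.loc[1, "C"]) == pd.isnull(df.loc[1, "C"])

print(same_for_frame, same_for_cell)  # True True
```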

Let's walk through some common ways to use it.

To check for empty cells in a specific column, you can use isna() on that column. This will return a Boolean Series indicating whether each cell in the column is empty or not.

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

empty_cells_D = df["D"].isna()
print("\nEmpty cells in column 'D':\n", empty_cells_D)

Output:

Empty cells in column 'D':
0     True
1     True
2    False
Name: D, dtype: bool

After creating the dataframe, we check for empty cells in column D by using the isna() function on the D column of the DataFrame. The isna() function returned a Boolean Series, where True indicates the cell is empty and False indicates it is not.

The resulting Boolean Series is stored in the variable empty_cells_D, and then it is printed to the console.

To check for empty cells in the entire DataFrame, use isna() on the DataFrame itself.

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

empty_cells_df = df.isna()
print("Empty cells in the DataFrame:\n", empty_cells_df)

Output:

Empty cells in the DataFrame:
       A      B      C      D
0  False  False  False   True
1  False  False   True   True
2  False  False  False  False

As we can see, this returned a DataFrame of Boolean values, indicating whether each cell is empty or not.

Now, if you want to count the number of empty cells in a specific column, you can chain sum() after using isna() or isnull() on that column. This will count the True values, which correspond to empty cells.

import pandas as pd

# Create a sample DataFrame
df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

# Count empty cells in column 'D'
empty_cells_count_D = df["D"].isna().sum()
print("Number of empty cells in column 'D':", empty_cells_count_D)

Output:

Number of empty cells in column 'D': 2

Here, the code counts the number of empty cells in column D of the DataFrame df.

  • df['D'].isna() uses isna() to generate a Boolean Series indicating whether each cell in column D is empty (True) or not (False).
  • Then, .sum() calculates the sum of True values in the Boolean Series, effectively counting the number of empty cells in column D.

Finally, the resulting count is stored in the variable empty_cells_count_D and is printed.
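The same idea extends to the whole DataFrame. As a sketch, calling sum() on the Boolean DataFrame gives one count per column, and summing those counts gives the total number of empty cells. The names counts_per_column and total_missing are chosen for this example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

# One count of empty cells per column.
counts_per_column = df.isna().sum()

# Total number of empty cells in the DataFrame.
total_missing = int(counts_per_column.sum())

print(counts_per_column)
print("Total empty cells:", total_missing)  # Total empty cells: 3
```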

Use the any() Function to Check if the Cell Is Empty

Another way to check whether any cell is empty is the any() function. Applied to a Boolean Series, it returns True as long as at least one value is True, and False otherwise.

We need to use the isnull() or isna() function to make it work. Both isnull().any() and isna().any() serve the purpose of identifying whether any empty cells exist in a DataFrame.

Let’s explore some common methods to utilize isnull().any() or isna().any() to identify empty cells in a Pandas DataFrame. We will use the same dataframe from the previous section.

To check for empty cells in a specific column, use isnull().any() or isna().any() on that column. These functions return a Boolean value indicating if any cell in the column is empty.

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

any_empty_cells_B = df["B"].isnull().any()
print("Are there empty cells in column 'B'?", any_empty_cells_B)

any_empty_cells_D = df["D"].isna().any()
print("Are there empty cells in column 'D'?", any_empty_cells_D)

Output:

Are there empty cells in column 'B'? False
Are there empty cells in column 'D'? True

This code checks if there are any empty cells in columns B and D.

Both df['B'].isnull() and df['D'].isna() generate a Boolean Series indicating whether each cell in columns B and D is empty (True) or not (False).

Then, .any() returns True if at least one True value is present in the Boolean Series, indicating that the column contains at least one empty cell. The results are stored in any_empty_cells_B and any_empty_cells_D: column B has no empty cells, so its result is False, while column D has two, so its result is True.

Now, to check for empty cells in the entire DataFrame, apply isnull().any() or isna().any() on the DataFrame itself.

import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

any_empty_cells_df = df.isnull().any()
print("Are there empty cells in the DataFrame?\n", any_empty_cells_df)

Output:

Are there empty cells in the DataFrame?
A    False
B    False
C     True
D     True
dtype: bool

As we can see, this returned a Boolean value for each column, indicating if any cell in that column is empty.
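If a single True or False for the whole DataFrame is needed, one option is to chain a second any() over the per-column result. Passing axis=1 instead reports which rows contain an empty cell. This is a minimal sketch on the same sample data, with variable names chosen for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

# First any() reduces each column; second any() reduces the column results.
has_any_missing = df.isna().any().any()

# axis=1 reduces across columns, giving one Boolean per row.
rows_with_missing = df.isna().any(axis=1)

print("Any empty cell in the DataFrame?", has_any_missing)  # True
print(rows_with_missing)
```

Here rows 0 and 1 are flagged because each has at least one NaN, while row 2 is complete.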

Use NumPy isnan to Check if the Cell Is Empty

We can also use the isnan function in NumPy to check for missing or NaN (Not a Number) values in a DataFrame. This function allows us to identify empty cells in a specific column and store their indices for further analysis or manipulation.

We can store the indices of the empty cells in a list by combining the column's index attribute, the apply() function, and the numpy.isnan test.

First, we call apply() on the column with numpy.isnan, which produces a Boolean Series marking the NaN values. We then use that Boolean Series to filter the column's index, so only the labels of the empty cells remain.

Finally, the filtered index is passed to the list() function.

Take a look at the example below:

import pandas as pd
import numpy as np

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))


def get_empty_cell_indices(column):
    return list(column.index[column.apply(np.isnan)])


empty_cell_indices = get_empty_cell_indices(df["D"])

print("Original DataFrame:")
print(df)

print("\nIndices of empty cells in column 'D':", empty_cell_indices)

Output:

Original DataFrame:
   A  B    C    D
0  1  2  3.0  NaN
1  3  4  NaN  NaN
2  3  4  5.0  6.0

Indices of empty cells in column 'D': [0, 1]

Here, our focus is on checking the empty cells in a specific column, D. We want to retrieve the indices of the empty cells within this column.

The get_empty_cell_indices() function is defined to check for empty cells in a given column and return their indices. Inside the function, apply() runs isnan from NumPy on each value of the column, and in this example the function is called with column D of the DataFrame.

The indices of the empty cells in column D are stored in a list using list(column.index[column.apply(np.isnan)]), which is then displayed along with the original DataFrame.
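Two points are worth sketching here. First, np.isnan is vectorized, so it can be called on a numeric column directly without apply(), and isna() gives the same indices. Second, np.isnan is only defined for numeric data, so on an object-dtype column holding strings it raises a TypeError, whereas isna() still works. The text_column Series below is a made-up example, not part of the article's data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4], [3, 4, 5, 6]], columns=list("ABCD"))

# Vectorized alternatives to column.apply(np.isnan) on a numeric column.
indices_isnan = df.index[np.isnan(df["D"])].tolist()
indices_isna = df.index[df["D"].isna()].tolist()
print(indices_isnan, indices_isna)  # [0, 1] [0, 1]

# Hypothetical text column stored with object dtype.
text_column = pd.Series(["x", None, "z"], dtype=object)

try:
    np.isnan(text_column.to_numpy())
    isnan_supported = True
except TypeError:
    # np.isnan cannot handle non-numeric (object) data.
    isnan_supported = False

# isna() handles the same column without error.
text_missing = text_column.index[text_column.isna()].tolist()
print(isnan_supported, text_missing)  # False [1]
```

For columns that may contain text or mixed types, isna() is therefore the safer choice.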

Conclusion

In this tutorial, we have covered several approaches to check whether a cell in a Pandas DataFrame is empty or contains missing values. By utilizing functions such as isnull(), isna(), and any(), along with isnan from NumPy, you can efficiently find and manage empty cells within your dataset.

Tailoring your approach based on the specific requirements of your analysis ensures data integrity and enhances the accuracy of your insights. With these methods, you are well-prepared to handle and process empty data effectively in your data analysis endeavors.


Olorunfemi is a lover of technology and computers. In addition, I write technology and coding content for developers and hobbyists. When not working, I learn to design, among other things.
