How to Visualize KDE Plot With Pandas and Seaborn

Samreena Aslam Feb 02, 2024
  1. Data Visualization Using Normal KDE Plot and Seaborn in Python
  2. One-Dimensional KDE Plot Using Pandas and Seaborn in Python
  3. Two-Dimensional or Bivariate KDE Plot Using Pandas and Seaborn in Python
  4. Conclusion

KDE stands for Kernel Density Estimate, a non-parametric method for estimating and visualizing the probability density of a continuous variable. When you want to compare several distributions, a KDE plot is less cluttered and easier to interpret than overlaid histograms.

Using KDE, we can show multiple data samples in a single plot, which keeps comparisons compact.
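To make the idea concrete, here is a minimal sketch of a one-dimensional Gaussian KDE written directly in NumPy. The function name `gaussian_kde_1d` and the bandwidth of 0.3 are assumptions chosen for illustration; seaborn selects a bandwidth automatically and this is not its internal implementation.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate at each point of `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized distance between every grid point and every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # Gaussian kernel evaluated at each distance
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Average the kernels and rescale by the bandwidth
    return kernel.sum(axis=1) / (samples.size * bandwidth)


rng = np.random.default_rng(0)
samples = rng.normal(size=1000)          # 1000 draws from a standard normal
grid = np.linspace(-6.0, 6.0, 601)       # points where the density is evaluated
density = gaussian_kde_1d(samples, grid, bandwidth=0.3)  # hand-picked bandwidth

# A density should integrate to roughly 1 over the grid
area = float(np.sum(density) * (grid[1] - grid[0]))
```

Each sample contributes one small bell curve, and the estimate is their average. A smaller bandwidth gives a bumpier curve, while a larger one gives a smoother curve.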

Seaborn is a Python visualization library built on top of matplotlib. It integrates with pandas and NumPy data structures.

Data scientists use this library to create informative and attractive statistical charts that make patterns in the data easier to see.

We can plot univariate and bivariate KDE graphs using Seaborn's kdeplot function together with Pandas.

This article shows how to create KDE plots with pandas and seaborn, using a few samples of the mtcars dataset.

Before starting, install the seaborn and pandas libraries using the pip command. Installing seaborn also pulls in matplotlib and NumPy as dependencies.

pip install seaborn
pip install pandas

Data Visualization Using Normal KDE Plot and Seaborn in Python

We can plot data with Seaborn's basic kdeplot function.

In the following example, we generate 1000 samples from a standard normal distribution with NumPy's random module. The result is a NumPy array, which Seaborn accepts directly, just as it accepts pandas objects.

Example Code:

import seaborn as sn
import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(1000)
# KDE plot with seaborn; fill=True shades the area under the curve
# (fill replaces the deprecated shade parameter)
res = sn.kdeplot(data, color="red", fill=True)
plt.show()

Output:

KDE plot visualization with pandas and seaborn - KDE plot with Seaborn

We can also draw the same kind of density vertically, with the data variable on the y-axis. Older seaborn versions used the property vertical=True for this; recent releases deprecate it in favor of passing the data as y.

Example Code:

import seaborn as sn
import matplotlib.pyplot as plt
import numpy as np

data = np.random.randn(1000)
# KDE plot with seaborn; passing the data as y flips the plot
res = sn.kdeplot(y=data, color="green", fill=True)
plt.show()

Output:

KDE plot visualization with pandas and seaborn - KDE plot with Seaborn

One-Dimensional KDE Plot Using Pandas and Seaborn in Python

We can visualize the probability distribution for a single target or continuous attribute using the KDE plot. In the following example, we have read a CSV file of the mtcars dataset.

Our sample file contains more than 350 entries, and we will visualize the univariate distribution of the hp column along the x-axis.

Example Code:

import seaborn as sn
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Read the CSV file of the dataset using pandas (adjust the path for your machine)
dataset = pd.read_csv(r"C:\Users\DELL\OneDrive\Desktop\samplecardataset.csv")
# KDE plot of the hp column using seaborn
sn.kdeplot(data=dataset, x="hp", fill=True, color="red")
plt.show()

Output:

KDE plot visualization with pandas and seaborn - KDE plot with Seaborn

You can also flip the plot by visualizing the data variable along the y-axis.

Example Code:

import seaborn as sn
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Read the CSV file using pandas (adjust the path for your machine)
dataset = pd.read_csv(r"C:\Users\DELL\OneDrive\Desktop\samplecardataset.csv")
# KDE plot with the hp column on the y-axis
sn.kdeplot(data=dataset, y="hp", fill=True, color="red")
plt.show()

Output:

KDE plot visualization with pandas and seaborn - KDE plot with Seaborn

We can visualize the probability distributions of multiple variables in a single plot by calling kdeplot once per column.

Example Code:

import seaborn as sn
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Read the CSV file using pandas (adjust the path for your machine)
dataset = pd.read_csv(r"C:\Users\DELL\OneDrive\Desktop\samplecardataset.csv")
# One KDE curve per column, drawn on the same axes
sn.kdeplot(data=dataset, x="hp", fill=True, color="red", label="hp")
sn.kdeplot(data=dataset, x="mpg", fill=True, color="green", label="mpg")
sn.kdeplot(data=dataset, x="disp", fill=True, color="blue", label="disp")
plt.legend()  # identify each curve by its column name
plt.show()

Output:

KDE plot visualization with pandas and seaborn - KDE plot with Seaborn

Two-Dimensional or Bivariate KDE Plot Using Pandas and Seaborn in Python

We can also draw two-dimensional, or bivariate, KDE plots using the seaborn and pandas libraries.

A bivariate KDE plot shows the joint probability density of two continuous variables, with one variable on the x-axis and the other on the y-axis.

Example Code:

import seaborn as sn
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Read the CSV file using pandas (adjust the path for your machine)
dataset = pd.read_csv(r"C:\Users\DELL\OneDrive\Desktop\samplecardataset.csv")
# Bivariate KDE plot of hp against mpg
sn.kdeplot(data=dataset, fill=True, x="hp", y="mpg")
plt.show()

Output:

KDE plot visualization with pandas and seaborn - KDE plot with Seaborn

Similarly, we can overlay several bivariate distributions in a single KDE plot.

Example Code:

import seaborn as sn
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Read the CSV file using pandas (adjust the path for your machine)
dataset = pd.read_csv(r"C:\Users\DELL\OneDrive\Desktop\samplecardataset.csv")
# Two bivariate KDE plots on the same axes, each with its own colormap
sn.kdeplot(data=dataset, fill=True, x="hp", y="mpg", cmap="Blues")
sn.kdeplot(data=dataset, fill=True, x="hp", y="cyl", cmap="Greens")
plt.show()

Output:

KDE plot visualization with pandas and seaborn - KDE plot with Seaborn

Conclusion

This tutorial demonstrated KDE plot visualization using the Pandas and Seaborn libraries. We saw how to visualize the probability distribution of single and multiple samples in a one-dimensional KDE plot.

We also discussed how to use the KDE plot with Seaborn and Pandas to visualize two-dimensional data.
