Unpivot a Python Pandas Dataframe

Unpivot a Python Pandas Dataframe

  1. Use the melt() Function to Unpivot a Python Pandas Dataframe
  2. Conclusion

Python is a preferred language for data analysis due to its great data-centric packages ecosystem. One of these packages is Pandas, which significantly simplifies data import and analysis.

This article uses the Pandas dataframe melt() function to convert a dataframe from wide to long format while optionally leaving identifier variables set.

Use the melt() Function to Unpivot a Python Pandas Dataframe

This function can convert a dataframe into a format in which one or more columns are identifier variables (id vars). All other columns, considered measured variables (value vars), are unpivoted to the row axis, leaving only two non-identifier columns, variable and value.

Syntax:

DataFrame.melt(id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None)

Parameters:

  • frame:‎ This is the DataFrame we use.‎
  • id_vars:‎‎ This is the column we use as an identifier variable.
  • value_vars:‎‎ This is used to unpivot.‎
  • var_name: ‎This specifies a name for the variable column. If none is specified, the frame is used, variable, or columns name.
  • value_name:‎‎ The value column should be given a unique name.
  • col_level:‎ Use this level to melt the columns if they are MultiIndexes.
  • Returns: Convert a DataFrame into a format with one or more identifier variables in each column.

Example 1:

First, create a dataframe with columns A, Bsc, Cse, and xyz.

import pandas as pd
df = pd.DataFrame({"A":[12, 4, 5, 44, 1],
                "Bsc":[5, 2, 54, 3, 2],
                "Cse":[20, 16, 7, 3, 8],
                "xyz":[14, 3, 17, 2, 6]})
df

Output:

|      | A    | Bsc  | Cse  | xyz  |
| ---- | ---- | ---- | ---- | ---- |
| 0    | 12   | 5    | 20   | 14   |
| 1    | 4    | 2    | 16   | 3    |
| 2    | 5    | 54   | 7    | 17   |
| 3    | 44   | 3    | 3    | 2    |
| 4    | 1    | 2    | 8    |      |

The below function unpivots the dataframe. Here A is an identifier, and the Bsc column is the value variable.

df.melt(id_vars =['A'], value_vars =['Bsc'])

Output:

|      | A    | variable | value |
| ---- | ---- | -------- | ----- |
| 0    | 12   | Bsc      | 5     |
| 1    | 4    | Bsc      | 2     |
| 2    | 5    | Bsc      | 54    |
| 3    | 44   | Bsc      | 3     |
| 4    | 1    | Bsc      | 2     |

Example 2:

Set column A as the identifier variable and columns Bsc and Csc as the value variables using the dataframe.melt() function. ‎‎‎‎

df.melt(id_vars =['A'], value_vars =['Bsc', 'Cse'],
        var_name ='Variable_column', value_name ='Value_column')

Output:

| index | A    | Variable_column | Value_column |
| ----- | ---- | --------------- | ------------ |
| 0     | 12   | Bsc             | 5            |
| 1     | 4    | Bsc             | 2            |
| 2     | 5    | Bsc             | 54           |
| 3     | 44   | Bsc             | 3            |
| 4     | 1    | Bsc             | 2            |
| 5     | 12   | Cse             | 20           |
| 6     | 4    | Cse             | 16           |
| 7     | 5    | Cse             | 7            |
| 8     | 44   | Cse             | 3            |
| 9     | 1    | Cse             |              |

Conclusion

In this article, we discussed how to unpivot a dataframe or convert a dataframe into a long format to increase the number of rows.

For this purpose, we used the melt() function. One column is designated as a variable identifier, while the other is the identifier’s value.

Related Article - Pandas Dataframe

  • Pandas groupby() and diff()
  • Create Nested Dataframes in Pandas
  • Pandas Reverse Dataframe
  • Save Pandas Dataframe Table as a PNG
  • Convert Spark List to Pandas Dataframe
  • Determine the Length of List in Python Pandas Dataframe