Python Pandas pandas.pivot_table() Function

Minahil Noor Jan 30, 2023

Pandas Pandas Core

Syntax of pandas.pivot_table()
Example Codes: pandas.pivot_table()
Example Codes: pandas.pivot_table() to Specify Multiple Aggregate Function
Example Codes: pandas.pivot_table() to Use the margins Parameter

Python Pandas pandas.pivot_table() function avoids the repetition of data of the DataFrame. It summarizes the data and applies different aggregate functions on the data.

Syntax of `pandas.pivot_table()`

pandas.pivot_table(
    data,
    values=None,
    index=None,
    columns=None,
    aggfunc="mean",
    fill_value=None,
    margins=False,
    dropna=True,
    margins_name="All",
    observed=False,
)

Parameters

This function has several parameters. The default values of all the parameters are mentioned above.


`data`	It is the `DataFrame` from which we want to remove the repeated data.
`values`	It represents the column to aggregate.
`index`	It is a `column`, `grouper`, array, or a list. It represents the data column that we want as index i.e as rows.
`columns`	It is a `column`, `grouper`, array, or a list. It represents the data column that we want as columns in our output pivot table.
`aggfunc`	It is a function, list of functions, or a dictionary. It represents the aggregate function that will be applied to the data. If a list of aggregate functions is passed then there will be a column for each aggregate function in the resultant table with the column name on the top.
`fill_value`	It is a scalar. It represents the value that will replace the missing values in the output table
`margins`	It is a boolean value. It represents the row and column generated after taking the sum of the respective row and column.
`dropna`	It is a boolean value. It eliminates the columns whose values are `NaN` from the output table.
`margins_name`	It is a string. It represents the name of the row and column generated if `margins` value is `True`.
`observed`	It is a boolean value. If any grouper is categorical then this parameter applies. If it is `True`, it shows the observed values for categorical groupers. If it is `False`, it shows all values for categorical groupers

Return

It returns the summarized DataFrame.

Example Codes: `pandas.pivot_table()`

Let’s dig deeper into this function by implementing it.

import pandas as pd

dataframe = pd.DataFrame({
                            "Name": 
                                ["Olivia", 
                                "Olivia", 
                                "Olivia", 
                                "Olivia", 
                                "Meghan", 
                                "Meghan", 
                                "Meghan", 
                                "Meghan"],
                            "Date": 
                                ["03-06-2019", 
                                "04-06-2019", 
                                "03-06-2019", 
                                "04-06-2019", 
                                "03-06-2019", 
                                "04-06-2019", 
                                "03-06-2019", 
                                "04-06-2019"],
                            "Science Marks": 
                                [10, 
                                2, 
                                4, 
                                6, 
                                8, 
                                9, 
                                1, 
                                10]
                        })
print(dataframe)

The example DataFrame is,

     Name        Date  Science Marks
0  Olivia  03-06-2019             10
1  Olivia  04-06-2019              2
2  Olivia  03-06-2019              4
3  Olivia  04-06-2019              6
4  Meghan  03-06-2019              8
5  Meghan  04-06-2019              9
6  Meghan  03-06-2019              1
7  Meghan  04-06-2019             10

Note that the above data contains the same value in a column multiple times. This pivot_table function will summarize this data.

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Name": [
            "Olivia",
            "Olivia",
            "Olivia",
            "Olivia",
            "Meghan",
            "Meghan",
            "Meghan",
            "Meghan",
        ],
        "Date": [
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
        ],
        "Science Marks": [10, 2, 4, 6, 8, 9, 1, 10],
    }
)

pivotTable = pd.pivot_table(dataframe, index="Name", columns="Date")
print(pivotTable)

Output:

              Science Marks           
Date      03-06-2019 04-06-2019
Name                           
Meghan           4.5        9.5
Olivia           7.0        4.0

Here, we have chosen the Name column as the index and the Date as the column. The function has generated the result based on the default parameters. The default aggregate function mean() has calculated the mean of the values.

Example Codes: `pandas.pivot_table()` to Specify Multiple Aggregate Function

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Name": [
            "Olivia",
            "Olivia",
            "Olivia",
            "Olivia",
            "Meghan",
            "Meghan",
            "Meghan",
            "Meghan",
        ],
        "Date": [
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
        ],
        "Science Marks": [10, 2, 4, 6, 8, 9, 1, 10],
    }
)

pivotTable = pd.pivot_table(
    dataframe, index="Name", columns="Date", aggfunc=["sum", "count"]
)
print(pivotTable)

Output:

                   sum                    count           
              Science Marks            Science Marks           
Date      03-06-2019 04-06-2019    03-06-2019 04-06-2019
Name                                                    
Meghan             9         19             2          2
Olivia            14          8             2          2

We have used two aggregate functions. The columns of these functions are generated separately.

Example Codes: `pandas.pivot_table()` to Use the `margins` Parameter

import pandas as pd

dataframe = pd.DataFrame(
    {
        "Name": [
            "Olivia",
            "Olivia",
            "Olivia",
            "Olivia",
            "Meghan",
            "Meghan",
            "Meghan",
            "Meghan",
        ],
        "Date": [
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
            "03-06-2019",
            "04-06-2019",
        ],
        "Science Marks": [10, 2, 4, 6, 8, 9, 1, 10],
    }
)

pivotTable = pd.pivot_table(
    dataframe, index="Name", columns="Date", aggfunc=["sum", "count"], margins=True
)
print(pivotTable)

Output:

                   sum                        count               
              Science Marks                Science Marks               
Date      03-06-2019 04-06-2019 All    03-06-2019 04-06-2019 All
Name                                                            
Meghan             9         19  28             2          2   4
Olivia            14          8  22             2          2   4
All               23         27  50             4          4   8

The margins parameter has generated a new row named All and a new column named also as All which shows the sum of the row and column respectively.

Syntax of pandas.pivot_table()

Parameters

Return

Example Codes: pandas.pivot_table()

Example Codes: pandas.pivot_table() to Specify Multiple Aggregate Function

Example Codes: pandas.pivot_table() to Use the margins Parameter

Related Article - Pandas Core

Syntax of `pandas.pivot_table()`

Example Codes: `pandas.pivot_table()`

Example Codes: `pandas.pivot_table()` to Specify Multiple Aggregate Function

Example Codes: `pandas.pivot_table()` to Use the `margins` Parameter