How to Create a Pivot Table in PostgreSQL

Bilal Shahid Feb 02, 2024
How to Create a Pivot Table in PostgreSQL

A pivot table tends to aggregate some values given in an original table already, used to summarize large data flows. In PostgreSQL, it is a table returned with custom N value columns, which is the data type of the row summarized in our result.

Today we will be learning how to create a pivot table in PostgreSQL using a predefined table with existing values.

Use crosstab() to Create a Pivot Table in PostgreSQL

You can find crosstab() under the TABLE_FUNC heading in the PostgreSQL documentation. It is a function that returns a table with multiple rows.

Syntax:

crosstab ( sql text ) ? setof record

This query produces a pivot table containing row names plus N value columns, where N is determined by the row type specified in the calling query.

crosstabN ( sql text ) , setof table_crosstab_N

This query produces a pivot table containing row names plus N value columns. crosstab2, crosstab3, and crosstab4 are predefined.

Now, let’s see how we can use it. We will create a simple table called APARTMENT with columns; ID, UNIT, PRICE, AREA.

Example:

CREATE TABLE apartment (
   ID int PRIMARY KEY,
   UNIT int,
   PRICE int,
   AREA int
)

Now, let’s add a few values to our table:

INSERT INTO apartment VALUES (1, 20, 200, 10) , (2, 20, 200, 9), (3, 50, 190, 8);

So, if we look at our table now, it would be something as follows.

Output:

id    unit   price  area
1     20     200     10
2     20     200     9
3     50     190     8

And in our pivot table, this table would be represented as:

unit    ..180    190        200          210...
20         -      -      19/2 = 9.5       -
50         -      8         -             -

So here, we are taking the average of the AREA for each UNIT at their specific PRICES. We want to see the average AREA that we get on the PRICE mentioned.

So if you find the average AREA for a PRICE of 200 for a UNIT of 20, you’ll get the average as (10 + 9) / 2 = 8. In this way, you find the pivot table with the aggregates.

Now, you have understood the concept, so let us go ahead and implement it. We can write a query as follows:

Select UNIT, PRICE, avg(area)
from apartment
group by UNIT, PRICE

This query will implement the basic functionality of what we want. It uses the GROUP BY clause to take those columns in the table against which we wish to aggregate.

We wish to find the average of the AREA for a specific UNIT and PRICE; hence, we group using these two columns. A table is returned as follows.

Output:

unit   price        avg
50     190     8.0000000000000000
20     200     9.5000000000000000

Now, this table works properly as well, but let’s go ahead and use the crosstab(). You have to look at how it works and the constraints.

To use crosstab(), we have to ensure two crucial points:

  1. crosstab() needs to have distinct values for each row.
  2. crosstab() needs to have the same data types for each column.

So, make sure that no column has a different data type. Now, to pivot our results from this GROUP BY query, let’s go ahead and write something as follows:

select *
from crosstab
(
'Select UNIT::float, PRICE::float, avg(AREA)::float
from apartment
group by UNIT, PRICE') as ct(
    UNIT float,
    avge float
);

There are some critical points to consider here. We select the columns from the crosstab returned as the result table CT as defined by us with two columns: UNIT and avge.

After grouping, we get the columns from the APARTMENT and find the average. This query table is then pivoted for our final result.

Also, notice that we ensure to cast each column to the same data type. Of course, an average will either be a double-precision or float.

Hence, it is better to cast the int columns to float, even if unnecessary. If you run the query without casting, it will return an error as below.

Output:

ERROR:  return and sql tuple descriptions are incompatible
SQL state: 42601

Hence to avoid this, we make sure to cast to float or create a new table with values inserted into it with the type float. The difference in data types is what makes the tuple incompatible.

Output:

unit    avg
50       8
20      9.5

In other cases, there is no need to use crosstab(). Even using the GROUP BY is effective as it returns us a table with unique values for getting an average.

Without using the crosstab() function, we still would have gotten a result as follows:

unit   price   avg
50     190      8
20     200     9.5

You can notice in crosstab() that using a return list with three columns as we would want still returns us just values that contain the UNIT and average of the area. Why?

The SQL statement takes a set of categories and values, and in our case, the category becomes the UNIT, and the values tend to be the average column. Hence, only two columns are kept in our return table.

We can use it now that we know how crosstab() works for our needs. If we had UNITS with different PRICES and then took the AVG(), we would have an N number of columns, depending on our data.

Let’s change the values in our table to be as follows:

id    unit   price  area
1      20    200     10
2      20    170     9
3      50    190     8

Now, running the crosstab() query on this will return something as follows.

Output:

unit   avge   avge1
50      8    [NULL]
20     10      9

You can notice that changing the PRICE tends to spread out the AREA values into different averages. This happens because we also group our average by prices.

Same PRICE values will have an average of different values if present for all of them, but other PRICE values will only consist of averages of AREAS respective to each PRICE. Hence the table is sorted in this way.

Author: Bilal Shahid
Bilal Shahid avatar Bilal Shahid avatar

Hello, I am Bilal, a research enthusiast who tends to break and make code from scratch. I dwell deep into the latest issues faced by the developer community and provide answers and different solutions. Apart from that, I am just another normal developer with a laptop, a mug of coffee, some biscuits and a thick spectacle!

GitHub

Related Article - PostgreSQL Table