PostgreSQL Distinct Count of Field Values

Joy Idialu Jul 01, 2022
  1. Get the Distinct Count of Field Values in PostgreSQL
  2. Get the Distinct Count of Field Values Based on Another Field in PostgreSQL
PostgreSQL Distinct Count of Field Values

Getting the distinct values in a field is an important query to know. This tutorial discusses how to get the distinct count of values in a field.

Get the Distinct Count of Field Values in PostgreSQL

Consider a quiz_score table that keeps track of the score of each participant in a quiz game.

id player_id score
1 1 10
2 2 10
3 3 18
4 4 69
5 5 24
6 6 67
7 7 94
8 8 68
9 9 33
10 10 5

Here is the CREATE statement for the table:

CREATE TABLE quiz_score
(
    id integer NOT NULL GENERATED ALWAYS AS IDENTITY,
    player_id integer NOT NULL,
    score integer NOT NULL,
    CONSTRAINT quiz_score_pkey PRIMARY KEY (id)
);

And here is the INSERT statement to populate the table with the data:

INSERT INTO quiz_score (player_id, score)
SELECT i, floor(random()*(100-0+1))
FROM generate_series(1,10000000) i;

We have inserted ten million random scores into our table. Let us find out how many distinct scores there are in our table by running this query:

SELECT COUNT(DISTINCT score) FROM quiz_score;

This query took 3 seconds and 391 milliseconds with a count of 101 distinct values.

We can also run this other query to find the number of distinct scores in the table:

SELECT COUNT(*) FROM (SELECT DISTINCT score FROM quiz_score) AS DistinctScores;

This new query took 1 second and 572 milliseconds with a count of 101 distinct values.

As we can see, the second query was faster. Either query works fine, but the second query was faster in this case.

Get the Distinct Count of Field Values Based on Another Field in PostgreSQL

We will now introduce a new column called Expertise. This field is populated based on the player’s score.

The Expertise values are: Beginner, Intermediary, Expert, and Master. The expertise of a player is determined by a player’s score, as shown here:

Expertise score
Beginner 0 - 50
Intermediary 51 - 80
Expert 81 - 90
Master 91 - 100

The newly updated quiz_score table is:

id player_id score player_rank
1 1 10 Beginner
2 2 10 Beginner
3 3 18 Beginner
4 4 69 Intermediary
5 5 24 Beginner
6 6 67 Intermediary
7 7 94 Master
8 8 68 Intermediary
9 9 33 Beginner
34 34 89 Expert

Here is the ALTER TABLE statement to add the new column to the table:

ALTER TABLE quiz_score ADD COLUMN expertise text;

And here is the UPDATE statement to populate the expertise field:

UPDATE quiz_score
SET expertise =
(CASE
    WHEN score >= 0
        AND score <= 50 THEN 'Beginner'
    WHEN score > 50
        AND score <= 80 THEN 'Intermediary'
    WHEN score > 80
        AND score <= 90 THEN 'Expert'
    WHEN score > 90 THEN 'Master'
END);

In the previous example, we looked at distinct scores in the table. In this example, let us find out how many distinct scores there are for each expertise in our table by running this query:

SELECT expertise, COUNT(DISTINCT score)
FROM quiz_score
GROUP BY expertise

Here is the result:

  expertise   | count
--------------+-------
 Beginner     |    51
 Intermediary |    10
 Expert       |    30
 Master       |    10

From this query, we can tell that the Beginner expertise has 51 distinct scores, Intermediary has 10 distinct scores, Expert has 30 distinct scores, and Master has 10 distinct scores. The query took 14 seconds and 515 milliseconds.

We can also run this other query to find the number of distinct scores in the table:

SELECT
    DISTINCT ON (expertise) expertise,
    COUNT(DISTINCT score)
FROM
    quiz_score
GROUP BY expertise

This new query took 12 seconds and 165 milliseconds. In the second example, the second query was faster, but either query works fine.

In this tutorial, we have discussed how to get distinct values in a field and how to get distinct values in a field based on distinct values of another field.