MongoDB Distinct Aggregation

Mehvish Ashiq Jun 27, 2022
  1. MongoDB Distinct Aggregation
  2. Use the distinct() Method to Find Distinct Values in MongoDB
  3. Use the distinct Command to Find Distinct Values in MongoDB
  4. Use the $group Aggregation Operator to Find Distinct Values in MongoDB
MongoDB Distinct Aggregation

This article talks about MongoDB’s distinct aggregation. In this tutorial, we will learn different approaches that we can use to find unique/distinct values based on the project requirements.

MongoDB Distinct Aggregation

We must have a populated collection to learn MongoDB distinct aggregation. For that, we have created a collection named collection with sample data you may use.

Example Code:

// MongoDB version 5.0.8

//insert documents
> db.collection.insertMany([
    { "_id": 1, "state": "punjab", "city": "lahore", "zipcode" : 94608 },
    { "_id": 2, "state": "punjab", "city": "lahore", "zipcode" : 94608 },
    { "_id": 3, "state": "punjab", "city": "lahore", "zipcode" : 99559 },
    { "_id": 4, "state": "punjab", "city": "lahore", "zipcode" : 99559 },

    { "_id": 5, "state": "punjab", "city": "karachi", "zipcode" : 99523 },
    { "_id": 6, "state": "punjab", "city": "karachi", "zipcode" : 94608 },
    { "_id": 7, "state": "punjab", "city": "karachi", "zipcode" : 99559 },
    { "_id": 8, "state": "punjab", "city": "karachi", "zipcode" : 99545 },

    { "_id": 9, "state": "punjab", "city": "multan", "zipcode" : 99545 },
    { "_id": 10, "state": "punjab", "city": "multan", "zipcode" : 94608 },
    { "_id": 11, "state": "punjab", "city": "multan", "zipcode" : 99559 },
    { "_id": 12, "state": "punjab", "city": "multan", "zipcode" : 99559 }
]);

You can also execute the following command to look at the inserted data.

// MongoDB version 5.0.8

//display documents
> db.collection.find();

Use the distinct() Method to Find Distinct Values in MongoDB

Example Code 1:

// MongoDB version 5.0.8

// return the distinct values for the `zipcode` field across the collection
> db.collection.distinct( "zipcode");

OUTPUT:

[ 94608, 99523, 99545, 99559 ]

Example Code 2:

// MongoDB version 5.0.8

// returns unique `city` names where the `zipcode` is either 99559 or 99545 or both.
> db.collection.distinct( "city", {"zipcode": { $in: [99559,99545]}} );

OUTPUT:

[ "karachi", "lahore", "multan" ]

Example Code 3:

// MongoDB version 5.0.8

// use the `.length` property to obtain the size/length of the retrieved unique values
> db.collection.distinct( "city", {"zipcode": { $in: [99559,99545]}} ).length;

OUTPUT:

3

The distinct() method extracts the distinct values for the specified field across one collection or view and returns the output in an array.

It takes at least one parameter and goes up to three, where the first, second, and third parameters are field, query, and options, respectively.

The field parameter is the field name for which we should return unique (distinct) values. The query tells the documents from which we need to get the distinct values.

While the options parameter is optional, it is a document that specifies the options. The one using this method must have the following points in mind:

  1. The db.collection.distinct() provides the wrapper around a distinct command that we will cover in the next section.
  2. If the specified field’s value is an array, then the db.collection.distinct() will consider every element of an array as a separate value. For example, if the field contains its value as [2, [2], 2], then the db.collection.distinct() will take 2, [2], and 2 as separate values.

Use the distinct Command to Find Distinct Values in MongoDB

Example Code:

// MongoDB version 5.0.8

// find distinct values
> db.runCommand ( { distinct: "collection", key: "zipcode" } );

OUTPUT:

{ "values" : [ 94608, 99523, 99545, 99559 ], "ok" : 1 }

In this code, we used the distinct aggregation command that finds all the distinct values for the specified field across one collection. As a result, it returns one document with an array of distinct values.

It also has an embedded document with the query plan and query statistics.

Use the $group Aggregation Operator to Find Distinct Values in MongoDB

Example Code:

// MongoDB version 5.0.8

// find size of distinct values for each `city`
> db.collection.aggregate([
    {
        $group:{
            "_id": "$city",
            "unique_count": {
                $addToSet: "$zipcode"
             }
        }
    },
    {
        $project:{
            "distinct_zipcodes":{
                $size: "$unique_count"
             }
        }
    }
]);

OUTPUT:

{ "_id" : "karachi", "distinct_zipcodes" : 4 }
{ "_id" : "lahore", "distinct_zipcodes" : 2 }
{ "_id" : "multan", "distinct_zipcodes" : 3 }

This code example returns the distinct value of the zipcode field for every city. Read the following points to understand what is going on in the code.

  1. The $addToSet keeps adding values to the array until the value already exists. The $addToSet aggregation returns an array containing all the unique values we save in the unique_count variable.
  2. We make the city names as _id and use the $group aggregation operator to group them using _id and unique_count.
  3. Next, we use $project to specify the inclusion/suppression of the field, adding a new field, and resetting the value of an existing field. Here, we use $size to find the size of the array named $unique_count and save it into the distinct_zipcodes variable that is further displayed on the screen next to the _id field (see the output given above).
Mehvish Ashiq avatar Mehvish Ashiq avatar

Mehvish Ashiq is a former Java Programmer and a Data Science enthusiast who leverages her expertise to help others to learn and grow by creating interesting, useful, and reader-friendly content in Computer Programming, Data Science, and Technology.

LinkedIn GitHub Facebook