How to Use Pipeline in Lookup Operator in MongoDB

Mehvish Ashiq Feb 02, 2024
  1. What Is the Aggregation Pipeline
  2. What Is the $lookup Operator in MongoDB
  3. Use Pipeline in $lookup Operator to Join Conditions in MongoDB
How to Use Pipeline in Lookup Operator in MongoDB

This tutorial teaches how to use the pipeline in the lookup operator in MongoDB. Before moving on, it is mandatory to have enough knowledge about the aggregation pipeline and $lookup operator to understand the use of the pipeline in the $lookup operator while using MongoDB.

If you already know these concepts, you can quickly move to this tutorial’s last two code examples.

What Is the Aggregation Pipeline

It is a procedure of collecting data and returning the computed results. This process collects data from different documents, groups them as per specified conditions, and perform various kinds of operation on the grouped data.

For instance, average, sum, maximum and minimum. It is just like the SQL aggregation functions.

In MongoDB, we can use aggregation in the following three ways.

  1. Aggregation Pipeline - contains various stages to transform the provided documents. Every stage accepts the set of documents and generates another set of resultant documents that are further passed to the next stage, and this process continues until the final stage.

  2. Map-reduce Function - We use this function to aggregate results on a large scale. It has two functions, map and reduce.

    The map method groups all documents while the reduce method performs operations on grouped data.

  3. Single-purpose Aggregation - the simplest form of aggregation used to perform aggregation tasks but lacks some features compared to the aggregation pipeline method. We use this type of aggregation to perform tasks within a particular document, for instance, count the distinct values within a specific document(s).

You may also read this to dive deeper and know aggregation pipelines.

What Is the $lookup Operator in MongoDB

This operator is used to perform left outer join to merge data from one document to another within the same database. It filters the documents from a joined collection for further processing.

We can also use this operator to add an extra field to an existing document.

The $lookup operator adds a new array attribute (field) whose values (elements) match documents from the joined collection. Then these transformed documents are passed to the next stage.

The $lookup operator has three different syntaxes we can use considering the project requirements. This tutorial uses the $lookup syntax for Join conditions & Subqueries on the Joined Collection.

To practice with an example code, let’s prepare the sample collections with data.

Example Code:

db.createCollection('collection1');
db.createCollection('collection2');

db.collection1.insertMany([
    {"shopId": "001", "shopPosId": "001", "description": "description for 001"},
    {"shopId": "002", "description": "description for 002"},
    {"shopId": "003", "shopPosId": "003", "description": "description for 003"},
    {"shopId": "004", "description": "description for 004"}
]);

db.collection2.insertMany([
    {"shopId": "001", "shopPosId": "0078", "clientUid": "474192"},
    {"shopId": "002", "shopPosId": "0012", "clientUid": "474193"},
    {"shopId": "003", "shopPosId": "0034", "clientUid": "474194"},
    {"shopId": "004", "shopPosId": "0056", "clientUid": "474195"}
]);

Now, we can execute the following commands to see the inserted documents in each collection.

db.collection1.find();
db.collection2.find();

Use Pipeline in $lookup Operator to Join Conditions in MongoDB

To learn how to use pipeline in the $lookup operator, let’s join documents from two collections where the collection1.shopId is equal to the collection2.shopId, and the collection1 does not contain the shopPosId field.

Only those documents will be joined from both collections that will satisfy both conditions. See the example code given below.

Example Code:

db.collection2.aggregate([
    {
        "$lookup": {
        "from": "collection1",
        "let": { "shopId": "$shopId" },
        "pipeline": [{
            "$match": {
                "$and": [
                    {"$expr": {"$eq": ['$shopId', '$$shopId'] }},
                    { "shopPosId": { "$exists": false } }
                ]
            }
         }],
         "as": "shopDescription"
        }
    }
]).pretty();

Output:

use pipeline in lookup operator in mongodb - output 1

Did you carefully observe the output given above? Only those documents are joined from both collections that meet both conditions in the pipeline (the collection1.shopId is equal to the collection2.shopId, and the collection1 does not contain the shopPosId field).

Further, those documents that do not match these conditions have an empty array named shopDescription (see the red boxes in the above results). We can display only those resultant documents containing a non-empty shopDescription array (see the following query).

Example Code:

db.collection2.aggregate([
    {
        "$lookup": {
        "from": "collection1",
        "let": { "shopId": "$shopId" },
        "pipeline": [{
            "$match": {
                "$and": [
                    {"$expr": {"$eq": ['$shopId', '$$shopId'] }},
                    { "shopPosId": { "$exists": false } }
                ]
            }
         }],
         "as": "shopDescription"
        }
    },
    {
        "$match":{
            "shopDescription": { $exists: true, $not: {$size: 0} }
         }
    }
]).pretty();

Output:

use pipeline in lookup operator in mongodb - output2

Mehvish Ashiq avatar Mehvish Ashiq avatar

Mehvish Ashiq is a former Java Programmer and a Data Science enthusiast who leverages her expertise to help others to learn and grow by creating interesting, useful, and reader-friendly content in Computer Programming, Data Science, and Technology.

LinkedIn GitHub Facebook