MongoDB Project Nested Fields

Mehvish Ashiq Jun 22, 2022
  1. MongoDB Project Nested Fields
  2. Use the $project Aggregation Stage to Project Nested Fields in MongoDB
  3. Use $unset Aggregation Stage to Get Nested Fields Excluding the Specified Ones in MongoDB
  4. Use a forEach() Loop to Get Nested Fields in MongoDB
  5. Use the mapReduce() Method to Project Nested Fields in MongoDB

Today, we will learn how to use the $project and $unset aggregation stages, the forEach() loop, and the mapReduce() method to project nested fields while querying data in MongoDB.

MongoDB Project Nested Fields

In MongoDB, we can retrieve all documents using the find() method, but what if we only want access to specific nested fields? This is where we use projection.

We can project nested fields in various ways. Here, we will cover the following approaches.

  1. Use the $project aggregation stage
  2. Use the $unset aggregation stage
  3. Use the forEach() loop
  4. Use the mapReduce() function

To learn the above approaches, let’s create a collection named nested containing one document. You may also run the query given below to follow along.

Example Code:

// MongoDB version 5.0.8

> db.nested.insertOne(
    {
        "name": {
            "first_name": "Mehvish",
            "last_name": "Ashiq",
         },
         "contact": {
            "phone":{"type": "manager", "number": "123456"},
            "email":{ "type": "office", "mail": "delfstack@example.com"}
         },
         "country_name" : "Australien",
         "posting_locations" : [
             {
                 "city_id" : 19398,
                 "city_name" : "Bondi Beach (Sydney)"
             },
             {
                  "city_id" : 31101,
                  "city_name" : "Rushcutters Bay (Sydney)"
             },
             {
                  "city_id" : 31022,
                  "city_name" : "Wolly Creek (Sydney)"
             }
          ],
          "regions" : {
              "region_id" : 796,
              "region_name" : "Australien: New South Wales (Sydney)"
          }
    }
);

Run db.nested.find().pretty(); in the mongo shell to see the inserted data.

Use the $project Aggregation Stage to Project Nested Fields in MongoDB

Example Code:

// MongoDB version 5.0.8

> var current_location = "posting_locations";
> var project = {};
> project["id"] = "$"+current_location+".city_id";
> project["name"] = "$"+current_location+".city_name";
> project["regions"] = 1;

> var find = {};
> find[current_location] = {"$exists":true};

> db.nested.aggregate([
    { $match : find },
    { $project : project }
]).pretty()

OUTPUT:

{
        "_id" : ObjectId("62a96d397c7e3688aea26d0d"),
        "regions" : {
                "region_id" : 796,
                "region_name" : "Australien: New South Wales (Sydney)"
        },
        "id" : [
                19398,
                31101,
                31022
        ],
        "name" : [
                "Bondi Beach (Sydney)",
                "Rushcutters Bay (Sydney)",
                "Wolly Creek (Sydney)"
        ]
}

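Here, we save the name of the top-level field, posting_locations, in a variable called current_location.

Then, we use that variable to build the field paths $posting_locations.city_id and $posting_locations.city_name, and we store them in the project object using bracket notation to create its properties. Because posting_locations is an array of embedded documents, each path resolves to an array of values, one per element, as the output shows. Additionally, we set project["regions"] to 1 to include the regions field as it is.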

Next, we have another object named find that we will use in the aggregate() method to match the documents. In the aggregate() method, we use the $match stage to match the documents and $project to project the fields, whether nested or at the first level.
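As a quick check of what the bracket notation produces, the following plain JavaScript sketch rebuilds the project and find objects from the example above and prints the resulting pipeline. It needs no database and can be run in Node.js, for instance. The variable name pipeline is introduced here only for illustration.

```javascript
// Rebuild the objects from the example above (no database required).
const current_location = "posting_locations";

const project = {};
project["id"] = "$" + current_location + ".city_id";
project["name"] = "$" + current_location + ".city_name";
project["regions"] = 1;

const find = {};
find[current_location] = { "$exists": true };

// "pipeline" is an illustrative name for the array of stages
// that the example hands to db.nested.aggregate().
const pipeline = [{ $match: find }, { $project: project }];
console.log(JSON.stringify(pipeline, null, 2));
```

Printing the objects this way makes it easy to confirm that the concatenated strings are valid field paths before sending the pipeline to the server.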

We use $project to specify what fields we want to display in the output. We can use the following solution if we are interested in projecting the specified nested fields only without any filter query.

Example Code:

// MongoDB version 5.0.8

> var current_location = "posting_locations";
> db.nested.aggregate({
    $project: {
         "_id": 0,
         "city_id": "$" + current_location + ".city_id",
         "city_name": "$" + current_location + ".city_name",
         "regions": 1
    }
}).pretty();

OUTPUT:

{
        "regions" : {
                "region_id" : 796,
                "region_name" : "Australien: New South Wales (Sydney)"
        },
        "city_id" : [
                19398,
                31101,
                31022
        ],
        "city_name" : [
                "Bondi Beach (Sydney)",
                "Rushcutters Bay (Sydney)",
                "Wolly Creek (Sydney)"
        ]
}
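The outputs above show city_id and city_name as arrays because the field path passes through the posting_locations array. The sketch below is a simplified plain JavaScript model of that behavior, not MongoDB's actual implementation. The helper name resolveFieldPath is hypothetical, and the sample object is a cut-down copy of the document inserted earlier.

```javascript
// Hypothetical helper: a simplified model of how a field path such as
// "$posting_locations.city_id" resolves against a document.
function resolveFieldPath(doc, path) {
  const keys = path.replace(/^\$/, "").split(".");
  let current = doc;
  for (const key of keys) {
    if (Array.isArray(current)) {
      // Step into every embedded document and collect the values that exist.
      current = current
        .filter((item) => item !== null && typeof item === "object" && key in item)
        .map((item) => item[key]);
    } else if (current !== null && typeof current === "object") {
      current = current[key];
    } else {
      return undefined;
    }
  }
  return current;
}

// Cut-down copy of the sample document.
const doc = {
  posting_locations: [
    { city_id: 19398, city_name: "Bondi Beach (Sydney)" },
    { city_id: 31101, city_name: "Rushcutters Bay (Sydney)" },
    { city_id: 31022, city_name: "Wolly Creek (Sydney)" }
  ],
  regions: { region_id: 796 }
};

console.log(resolveFieldPath(doc, "$posting_locations.city_id")); // 19398, 31101, 31022 as an array
console.log(resolveFieldPath(doc, "$regions.region_id"));         // 796
```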

Use $unset Aggregation Stage to Get Nested Fields Excluding the Specified Ones in MongoDB

Example Code:

// MongoDB version 5.0.8

> db.nested.aggregate({
        $unset: ["posting_locations.city_id", "contact", "regions", "name", "_id"]
}).pretty()

OUTPUT:

{
        "country_name" : "Australien",
        "posting_locations" : [
                {
                        "city_name" : "Bondi Beach (Sydney)"
                },
                {
                        "city_name": "Rushcutters Bay (Sydney)"
                },
                {
                        "city_name": "Wolly Creek (Sydney)"
                }
        ]
}

Here, we use the $unset aggregation stage, which removes the specified field or array of fields from the documents passing through the pipeline. The documents stored in the collection are not modified.

Remember that we use dot notation to specify fields in embedded documents or in arrays of documents. The stage does nothing for a field that does not exist.

Note that the $unset update operator, which is used with update methods such as updateOne(), behaves differently with arrays. When we use $ to match an element of an array, that operator replaces the matching element with null instead of removing it from the array, which keeps the element positions and the array size consistent.
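To make the effect of the dotted paths concrete, here is a minimal plain JavaScript sketch that mimics what the $unset stage does to one document. It is a model under simplifying assumptions, not MongoDB's implementation. The helper names unsetPath and unsetFields are hypothetical, and the sample object omits _id.

```javascript
// Hypothetical helper: remove one dotted path from a value and return a copy.
function unsetPath(value, keys) {
  if (Array.isArray(value)) {
    // Apply the same path to every element of an array of documents.
    return value.map((item) => unsetPath(item, keys));
  }
  if (value === null || typeof value !== "object") return value;
  const [head, ...rest] = keys;
  const copy = { ...value };
  if (!(head in copy)) return copy; // a missing field is a no-op
  if (rest.length === 0) delete copy[head];
  else copy[head] = unsetPath(copy[head], rest);
  return copy;
}

// Hypothetical helper: apply a list of dotted paths in turn.
function unsetFields(doc, paths) {
  return paths.reduce((acc, path) => unsetPath(acc, path.split(".")), doc);
}

// Copy of the sample document without _id.
const doc = {
  name: { first_name: "Mehvish", last_name: "Ashiq" },
  contact: {
    phone: { type: "manager", number: "123456" },
    email: { type: "office", mail: "delfstack@example.com" }
  },
  country_name: "Australien",
  posting_locations: [
    { city_id: 19398, city_name: "Bondi Beach (Sydney)" },
    { city_id: 31101, city_name: "Rushcutters Bay (Sydney)" },
    { city_id: 31022, city_name: "Wolly Creek (Sydney)" }
  ],
  regions: { region_id: 796, region_name: "Australien: New South Wales (Sydney)" }
};

const result = unsetFields(doc, ["posting_locations.city_id", "contact", "regions", "name", "_id"]);
console.log(JSON.stringify(result, null, 2));
```

The result keeps only country_name and the city_name of each posting location, and the original doc object is left untouched, which mirrors the fact that the stage does not change stored documents.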

Use a forEach() Loop to Get Nested Fields in MongoDB

Example Code:

// MongoDB version 5.0.8

> var bulk = db.newcollection.initializeUnorderedBulkOp(),
   counter = 0;

> db.nested.find().forEach(function(doc) {
    var document = {};
    document["name"] = doc.name.first_name + " " + doc.name.last_name;
    document["phone"] = doc.contact.phone.number;
    document["mail"] = doc.contact.email.mail;
    bulk.insert(document);
    counter++;
    if (counter % 1000 == 0) {
        bulk.execute();
        bulk = db.newcollection.initializeUnorderedBulkOp();
    }
});

> if (counter % 1000 != 0) { bulk.execute(); }

You will see something similar to the following.

BulkWriteResult({
        "writeErrors" : [ ],
        "writeConcernErrors" : [ ],
        "nInserted" : 1,
        "nUpserted" : 0,
        "nMatched" : 0,
        "nModified" : 0,
        "nRemoved" : 0,
        "upserted" : [ ]
})

Next, execute the command below in your mongo shell to see the projected fields.

// MongoDB version 5.0.8

> db.newcollection.find().pretty();

OUTPUT:

{
        "_id" : ObjectId("62a96f2d7c7e3688aea26d0e"),
        "name" : "Mehvish Ashiq",
        "phone" : "123456",
        "mail" : "delfstack@example.com"
}

In this example, suppose we want to grab certain nested fields and insert them into a new collection. Inserting the transformed documents one at a time can be slow when the nested collection is large.

We can avoid this slow insert performance by using the unordered Bulk API. It queues the insert operations, sends them to the server in bulk, and returns a BulkWriteResult that reports how many documents were inserted and which writes failed.

So, we use the Bulk API to insert the desired data structure into the newcollection collection, where each new document is created inside the forEach() loop of the nested collection's cursor. To create the new properties, we use bracket notation.
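The transformation performed inside the forEach() callback can be tried on its own. The following plain JavaScript sketch applies the same bracket-notation assignments to a copy of the relevant part of the sample document. The function name toFlatContact is hypothetical and is not part of the original code.

```javascript
// Hypothetical helper: the same field extraction as the forEach() callback,
// returning the new document instead of queuing it for a bulk insert.
function toFlatContact(doc) {
  const document = {};
  document["name"] = doc.name.first_name + " " + doc.name.last_name;
  document["phone"] = doc.contact.phone.number;
  document["mail"] = doc.contact.email.mail;
  return document;
}

// Copy of the relevant part of the sample document.
const doc = {
  name: { first_name: "Mehvish", last_name: "Ashiq" },
  contact: {
    phone: { type: "manager", number: "123456" },
    email: { type: "office", mail: "delfstack@example.com" }
  }
};

console.log(toFlatContact(doc));
```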

For this code, we assume that we have a large amount of data, so we send the operations to the server in batches of 1,000.

As a result, we get good performance because we make one request to the server for every 1,000 inserts instead of one request per document.
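The counter-based batching can be sketched without a database. In the plain JavaScript below, the hypothetical function sendInBatches takes a send callback that stands in for bulk.execute(), so we can count how many round trips a given number of documents would need. It uses the same counter % batchSize checks as the example.

```javascript
// Hypothetical helper: "send" is an assumed callback standing in for
// bulk.execute(); it receives each full or final partial batch.
function sendInBatches(docs, batchSize, send) {
  let batch = [];
  let counter = 0;
  for (const doc of docs) {
    batch.push(doc);
    counter++;
    if (counter % batchSize === 0) {
      send(batch);  // flush a full batch
      batch = [];   // start a new one, like re-initializing the bulk op
    }
  }
  if (counter % batchSize !== 0) {
    send(batch);    // flush the remaining documents
  }
  return counter;
}

// 2,500 placeholder documents need three requests instead of 2,500.
const sizes = [];
const docs = Array.from({ length: 2500 }, (_, i) => ({ n: i }));
const total = sendInBatches(docs, 1000, (batch) => sizes.push(batch.length));
console.log(total, sizes); // 2500 documents in batches of 1000, 1000, and 500
```

The final check outside the loop matters: without it, the last partial batch would never be sent.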

Use the mapReduce() Method to Project Nested Fields in MongoDB

Example Code:

// MongoDB version 5.0.8

> function map() {
    for(var i in this.posting_locations) {
         emit({
             "country_id" : this.country_id,
             "city_id" : this.posting_locations[i].city_id,
             "region_id" : this.regions.region_id
         },1);
    }
}

> function reduce(id,docs) {
      return Array.sum(docs);
}

> db.nested.mapReduce(map,reduce,{ out : "map_reduce_output" } )

Now, run the following query to see the output.

// MongoDB version 5.0.8
> db.map_reduce_output.find().pretty();

OUTPUT:

{
        "_id" : {
                "country_id" : undefined,
                "city_id" : 19398,
                "region_id" : 796
        },
        "value" : 1
}
{
        "_id" : {
                "country_id" : undefined,
                "city_id" : 31022,
                "region_id" : 796
        },
        "value" : 1
}
{
        "_id" : {
                "country_id" : undefined,
                "city_id" : 31101,
                "region_id" : 796
        },
        "value" : 1
}

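For this example code, we use the mapReduce() function to perform map-reduce on all documents of the nested collection. Note that map-reduce is deprecated as of MongoDB 5.0 in favor of the aggregation pipeline, although it still runs on the version used here. Also, country_id is undefined in the output because the sample document has a country_name field but no country_id field. We follow a three-step process, briefly explained below.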

  • Define the map() function to process every input document. In this function, the this keyword refers to the current document being processed by the map-reduce operation, and the emit() function associates the given value with the given key.
  • Define the corresponding reduce() function, which is where the aggregation of data takes place. It takes two arguments, a key and the array of values emitted for that key; in our code example, they are named id and docs.

    Remember that the elements of docs are the values passed to emit() in the map() function. At this step, the reduce() function reduces the docs array to the sum of its elements. MongoDB does not call reduce() for a key that has only a single value, which is the case for every key in this example, so each value stays 1.

  • Finally, we perform map-reduce on all the documents in the nested collection by using map() and reduce() functions. We use out to save the output in the specified collection, which is map_reduce_output in this case.
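The three steps above can be sketched in plain JavaScript to see how emitted values are grouped by key. This is a simplified model, not MongoDB's implementation. The function runMapReduce is hypothetical, emit is passed to map() as an argument instead of being a global, and the order of the results is not meant to match the server's.

```javascript
// Hypothetical helper: group emitted values by key, then reduce each group.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    const id = JSON.stringify(key); // serialized key used only for grouping
    if (!groups.has(id)) groups.set(id, { key, values: [] });
    groups.get(id).values.push(value);
  };
  for (const doc of docs) map.call(doc, emit); // "this" is the current document
  return [...groups.values()].map(({ key, values }) => ({
    _id: key,
    // Like MongoDB, skip reduce() when a key has a single value.
    value: values.length === 1 ? values[0] : reduce(key, values)
  }));
}

function map(emit) {
  for (const location of this.posting_locations) {
    emit({
      country_id: this.country_id, // undefined: the sample has no such field
      city_id: location.city_id,
      region_id: this.regions.region_id
    }, 1);
  }
}

function reduce(id, docs) {
  return docs.reduce((sum, value) => sum + value, 0);
}

// Cut-down copy of the sample document.
const sample = {
  country_name: "Australien",
  posting_locations: [{ city_id: 19398 }, { city_id: 31101 }, { city_id: 31022 }],
  regions: { region_id: 796 }
};

const results = runMapReduce([sample], map, reduce);
console.log(results);
```

Passing the same document twice, as in runMapReduce([sample, sample], map, reduce), gives every key two emitted values, so reduce() runs and each value becomes 2.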

Mehvish Ashiq is a former Java Programmer and a Data Science enthusiast who leverages her expertise to help others to learn and grow by creating interesting, useful, and reader-friendly content in Computer Programming, Data Science, and Technology.
