This tutorial describes the default maximum size limit for a document stored in MongoDB. It also explains the alternative solution to use when the data exceeds that limit.
We will also learn how to use the default maximum size of a BSON document efficiently.
MongoDB Maximum Document Size
In MongoDB, documents (objects) are stored in BSON format. BSON (Binary JSON) is a binary serialization of JSON-like documents.
This format extends JSON with additional data types that are not part of JSON.
For instance, BSON has a BinData type that is not available in JSON. According to the MongoDB documentation, the size limit for a single BSON document is 16 megabytes.
The maximum document size exists to ensure that a single document cannot use an excessive amount of RAM or, during transmission, bandwidth. Remember that BSON documents can be nested up to 100 levels deep, where each array or object adds one level.
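The two limits above can be checked before a document is sent to the server. The following is a minimal sketch using only the Python standard library, not MongoDB's own validation. The helper names `nesting_depth` and `within_limits` are hypothetical, the encoded size is assumed to come from whatever BSON encoder your driver provides, and counting the top-level document as level 1 is an assumption.

```python
MAX_BSON_SIZE_BYTES = 16 * 1024 * 1024  # 16 MB document size limit
MAX_BSON_DEPTH = 100                    # nesting limit: each array/object adds a level


def nesting_depth(value):
    """Count nesting levels; every dict (object) or list (array) adds one level."""
    if isinstance(value, dict):
        return 1 + max((nesting_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((nesting_depth(v) for v in value), default=0)
    return 0  # scalar values add no level


def within_limits(doc, encoded_size):
    """Check a document against both limits.

    `encoded_size` is the BSON-encoded size in bytes, supplied by the caller
    (assumption: it is obtained from a real BSON encoder elsewhere).
    """
    return encoded_size <= MAX_BSON_SIZE_BYTES and nesting_depth(doc) <= MAX_BSON_DEPTH


doc = {"name": "sensor-1", "readings": [{"t": 1, "tags": ["a", "b"]}]}
print(nesting_depth(doc))  # object -> array -> object -> array
print(within_limits(doc, encoded_size=1024))
```

The depth check is recursive, which is fine for documents anywhere near the 100-level limit.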
Today, applications handle ever-growing volumes of data, so our data may well exceed the 16-megabyte size limit for a BSON document.
In that case, MongoDB provides the GridFS API to store data larger than 16MB.
What Is GridFS
GridFS is a MongoDB specification that we can use to store and access large files that exceed the BSON document limit (16MB), for instance, audio, video, or image files. It is similar to a file system for storing files, but the data is stored in MongoDB collections.
GridFS divides the file into chunks and stores each chunk in a separate document; by default, each chunk holds up to 255 KB of data, and only the last chunk may be smaller.
GridFS uses two collections, fs.files and fs.chunks by default, which store a file's metadata and its chunks, respectively.
Every chunk is identified by a unique _id (ObjectId) field, while the document in fs.files serves as the parent document. The files_id field in each fs.chunks document links the chunk to its parent.
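To make the layout concrete, here is a small in-memory sketch of how a file could be split into a parent document and linked chunk documents. It only imitates the structure described above and does not talk to MongoDB or use a GridFS driver. The function names are hypothetical, and `uuid` hex strings stand in for the ObjectId values MongoDB would generate.

```python
import uuid

CHUNK_SIZE = 255 * 1024  # default GridFS chunk size in bytes


def split_into_chunks(data, filename, chunk_size=CHUNK_SIZE):
    """Build one parent ("fs.files"-style) document and its ordered chunk documents."""
    files_id = uuid.uuid4().hex  # stand-in for an ObjectId
    files_doc = {
        "_id": files_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
    }
    chunks = [
        {
            "_id": uuid.uuid4().hex,   # every chunk has its own unique _id
            "files_id": files_id,      # link back to the parent document
            "n": n,                    # sequence number of the chunk
            "data": data[offset:offset + chunk_size],
        }
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by concatenating chunks in sequence order."""
    return b"".join(c["data"] for c in sorted(chunks, key=lambda c: c["n"]))


payload = bytes(1024 * 1024)  # 1 MiB of zero bytes as sample data
files_doc, chunks = split_into_chunks(payload, "demo.bin")
print(len(chunks), len(chunks[-1]["data"]))
```

In a real application the driver's GridFS support performs this splitting and reassembly for you; the sketch only shows why each stored document stays far below the 16MB limit.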
You can go through this article to understand the syntax while using GridFS.
Use Default BSON Document Size Limit Efficiently
The BSON document size limit (16MB) is generous. For instance, the whole uncompressed text of The War of the Worlds is only 364k (HTML), but there are always exceptions.
If your data exceeds the limit, you can use the GridFS API discussed earlier or devise a strategy to use the 16MB efficiently.
Let's consider a scenario where we want to develop an XYZ application. The application needs four data types: strings, numbers, Booleans, and dates (represented as UNIX ms).
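With the 16MB size limit, MongoDB can hold around two million raw 64-bit values (16MB divided by 8 bytes per value), before accounting for BSON's per-element overhead. This covers the numbers, and the dates and Booleans as well.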
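The estimate above is simple arithmetic, and it is worth seeing how BSON's own overhead changes it. The sketch below is a rough calculation, not a measurement against a server. It assumes the values are stored as one BSON array of int64 elements, where each element costs a type byte, the array index written as a null-terminated string key, and the 8-byte value; it ignores the enclosing document's own wrapper bytes. The function name is hypothetical.

```python
LIMIT_BYTES = 16 * 1024 * 1024  # 16 MB document limit
BYTES_PER_VALUE = 8             # one 64-bit number or UNIX-ms date

# Raw capacity: how many 8-byte values fit if there were no overhead at all.
raw_capacity = LIMIT_BYTES // BYTES_PER_VALUE


def int64_array_capacity(limit=LIMIT_BYTES):
    """Estimate how many int64 elements fit in a single BSON array."""
    used = 4 + 1  # int32 length prefix + trailing null byte of the array itself
    count = 0
    while True:
        # type byte + index key as decimal digits + key terminator + value
        element = 1 + len(str(count)) + 1 + BYTES_PER_VALUE
        if used + element > limit:
            return count
        used += element
        count += 1


print(raw_capacity)            # the "around two million" figure
print(int64_array_capacity())  # noticeably lower once keys are counted
```

Under these assumptions the array holds a little over one million values, so the two-million figure is best read as an upper bound on raw payload rather than a count of storable array elements.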
String values need special attention because every UTF-8 character occupies between one and four bytes, so long or non-ASCII text grows quickly. We need to optimize the size of all the columns containing string values.
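A quick way to see how much space a string takes is to encode it as UTF-8 and count the bytes. The sketch below uses only the Python standard library; the helper `bson_string_value_size` is a hypothetical name, and it assumes the BSON string layout of a 4-byte length prefix, the UTF-8 bytes, and a trailing null byte (the field name and type byte are not included).

```python
def utf8_size(text):
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


def bson_string_value_size(text):
    """Approximate size of a BSON string value: int32 length + bytes + null terminator."""
    return 4 + utf8_size(text) + 1


# One sample character for each UTF-8 width (escapes keep the source ASCII-only).
samples = {
    "a": "a",                      # ASCII letter
    "e-acute": "\u00e9",           # Latin letter with accent
    "euro": "\u20ac",              # euro sign
    "emoji": "\U0001F600",         # emoji outside the Basic Multilingual Plane
}
sizes = {name: utf8_size(ch) for name, ch in samples.items()}
print(sizes)
print(bson_string_value_size("hello"))
```

Measuring representative strings this way gives a realistic per-value cost to plug into the size budget before choosing one of the reduction strategies.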
We can try the following ways to decrease the size of a column containing string values.
- We can use the
- We can create a dictionary and insert all unique string values into it. Then, we replace the string values in the column with their indexes in the dictionary.
This approach is useful if we have many repeated string values in a field. This method will not help if someone wants to store a column of hashes, but they can use the
- We can also split the column into several chunks and save these chunks in other documents linked to the main document.
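The dictionary approach from the list above can be sketched as follows. This is an illustrative, in-memory version with hypothetical function names and made-up sample data; how the dictionary and the index column are laid out in actual MongoDB documents is left open.

```python
def dictionary_encode(values):
    """Replace each string with the index of its first occurrence in a dictionary."""
    dictionary = []  # unique strings, in order of first appearance
    index_of = {}    # string -> position in `dictionary`
    indexes = []     # the column, rewritten as small integers
    for value in values:
        if value not in index_of:
            index_of[value] = len(dictionary)
            dictionary.append(value)
        indexes.append(index_of[value])
    return dictionary, indexes


def dictionary_decode(dictionary, indexes):
    """Restore the original column from the dictionary and the index list."""
    return [dictionary[i] for i in indexes]


cities = ["Berlin", "Paris", "Berlin", "Berlin", "Paris", "Oslo"]
dictionary, indexes = dictionary_encode(cities)
print(dictionary)  # each distinct string stored once
print(indexes)     # one small integer per row
```

The saving grows with the amount of repetition: each distinct string is stored once, and every row costs only an integer. For a column where almost every value is unique, the dictionary is as large as the original column and nothing is gained.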
There is a reference article demonstrating all these approaches.