How to Fuzzy Search in MongoDB

Mehvish Ashiq Feb 16, 2024
  1. What Is Fuzzy Search
  2. Create a Sample Collection in MongoDB
  3. Use the $regex Operator to Perform Fuzzy Search in MongoDB
  4. Use the $text Query to Perform Fuzzy Search in MongoDB
  5. Use JavaScript’s Fuse.js Library to Perform Fuzzy Search in MongoDB

This article explains what fuzzy search is and shows how to perform it with MongoDB.

We start with the $regex operator and the $text query, then use a JavaScript library named Fuse.js to run a fuzzy search over the documents.

What Is Fuzzy Search

Fuzzy search finds text that does not match the search term exactly but matches it closely. It is useful for returning relevant results even when the search term is misspelled.

For instance, Google shows web pages relevant to a search term even when the term is mistyped. Regular expressions (regex) offer a simple, quick way to approximate this behavior with pattern matching, although they do not correct typos on their own.
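One common way to quantify how closely two strings match is the Levenshtein (edit) distance: the number of single-character insertions, deletions, or substitutions needed to turn one string into the other. The sketch below only illustrates that idea. The function name levenshtein is our own, and none of the MongoDB features discussed later are claimed to use exactly this algorithm.

```javascript
// Illustrative helper (not part of MongoDB): classic dynamic-programming
// edit distance between two strings.
function levenshtein(a, b) {
  // prev[j] is the distance between the processed prefix of a and b.slice(0, j).
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);

  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // distance from a.slice(0, i) to the empty string
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete a character from a
        curr[j - 1] + 1,    // insert a character into a
        prev[j - 1] + cost  // substitute (or keep, when cost is 0)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

console.log(levenshtein('jahson', 'johnson'));  // 2
console.log(levenshtein('johnson', 'johnson')); // 0
```

A small distance relative to the length of the strings suggests a likely typo, which is the intuition behind most fuzzy matching tools.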

Create a Sample Collection in MongoDB

We will go from basic to advanced techniques. To practice, let’s create a sample collection named collection_one in which every document has a single field, name.

MongoDB creates the _id field automatically, so we do not have to supply it. The following commands create and populate the collection.

Example Code:

> db.createCollection('collection_one')
> db.collection_one.insertMany([
    { name : 'Mehvish Ashiq'},
    { name : 'Jennifer Johnson'},
    { name : 'Natalie Robinson'},
    { name : 'John Ferguson'},
    { name : 'Samuel Patterson'},
    { name : 'Salvatore Callahan'},
    { name : 'Mikaela Christensen'}
])
> db.collection_one.find()

OUTPUT:

{ "_id" : ObjectId("62939a37b3a0d806d251ddae"), "name" : "Mehvish Ashiq" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddaf"), "name" : "Jennifer Johnson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb0"), "name" : "Natalie Robinson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb1"), "name" : "John Ferguson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb3"), "name" : "Salvatore Callahan" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb4"), "name" : "Mikaela Christensen" }

Use the $regex Operator to Perform Fuzzy Search in MongoDB

Example Code:

> db.collection_one.find({"name": /m/})

OUTPUT:

{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }

In this code, we searched the name field and retrieved all documents whose name contains the lowercase letter m.

We got only one document because regular expressions are case-sensitive by default, so the two names that start with a capital M were not matched. To include them, we can add the i option, which makes the search case-insensitive.

Example Code:

> db.collection_one.find({"name": /m/i})

OUTPUT:

{ "_id" : ObjectId("62939a37b3a0d806d251ddae"), "name" : "Mehvish Ashiq" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb4"), "name" : "Mikaela Christensen" }
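The same patterns can be tried in plain JavaScript, which is a quick way to reason about what a regular expression will match before running it against the database. This sketch assumes the seven names from the sample collection are held in a local array; names and matchNames are illustrative identifiers. MongoDB uses Perl-compatible regular expressions rather than JavaScript's engine, but simple patterns like these behave the same in both.

```javascript
// Local copy of the sample data (assumption: mirrors collection_one).
const names = [
  'Mehvish Ashiq', 'Jennifer Johnson', 'Natalie Robinson', 'John Ferguson',
  'Samuel Patterson', 'Salvatore Callahan', 'Mikaela Christensen',
];

// Illustrative helper: keep only the names that the pattern matches.
function matchNames(pattern) {
  return names.filter((name) => pattern.test(name));
}

console.log(matchNames(/m/));
// [ 'Samuel Patterson' ]

console.log(matchNames(/m/i));
// [ 'Mehvish Ashiq', 'Samuel Patterson', 'Mikaela Christensen' ]
```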

This shows that a correctly designed regular expression is very important; otherwise, we may get misleading results. The same search can also be written with the $regex operator and its $options field.

Example Code (case-insensitive search):

> db.collection_one.find({'name': {'$regex': 'm','$options': 'i'}})

OUTPUT:

{ "_id" : ObjectId("62939a37b3a0d806d251ddae"), "name" : "Mehvish Ashiq" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb4"), "name" : "Mikaela Christensen" }

Similarly, we can get all the documents where the name ends with the two letters on by anchoring the pattern with $.

Example Code:

> db.collection_one.find({name:{'$regex' : 'on$', '$options' : 'i'}})

OUTPUT:

{ "_id" : ObjectId("62939a37b3a0d806d251ddaf"), "name" : "Jennifer Johnson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb0"), "name" : "Natalie Robinson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb1"), "name" : "John Ferguson" }
{ "_id" : ObjectId("62939a37b3a0d806d251ddb2"), "name" : "Samuel Patterson" }
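When the search term comes from user input, two small helpers are often useful: one that escapes regex metacharacters so the term is matched literally, and one that allows other characters between the typed letters, which tolerates omitted letters (though not wrong or transposed ones). Both escapeRegex and toSubsequencePattern below are hypothetical helper names, not MongoDB functions; only the resulting $regex/$options query shape comes from the examples above.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a regex.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: 'jhsn' -> 'j.*h.*s.*n', so the typed letters must
// appear in order, with anything (or nothing) between them.
function toSubsequencePattern(term) {
  return Array.from(term).map(escapeRegex).join('.*');
}

// Build a case-insensitive query object for the name field.
function buildNameQuery(term) {
  return { name: { $regex: toSubsequencePattern(term), $options: 'i' } };
}

console.log(escapeRegex('a.b'));            // a\.b
console.log(toSubsequencePattern('jhsn'));  // j.*h.*s.*n
```

Passing buildNameQuery('jhsn') to db.collection_one.find() should match Jennifer Johnson and John Ferguson in the sample data. Keep in mind that unanchored patterns like this generally cannot use an index efficiently, so they suit small collections best.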

Use the $text Query to Perform Fuzzy Search in MongoDB

The $text query will not work on our sample collection, collection_one, because it does not have a text index. So, we create the index as follows.

Example Code:

> db.collection_one.createIndex({name:"text"});

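The above statement also creates the collection if it does not already exist. A text index can cover one field or several fields, listed as comma-separated entries in the same index document. Note that a collection can have only one text index, so all text fields must go into that index.

See the following example.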

db.collection_name.createIndex({name:"text", description:"text"});

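Once the index is created, we can search it as shown below. Note that $text matches whole words (after stemming) rather than partial or misspelled terms, so it is closer to a keyword search than to a typo-tolerant fuzzy search.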

Example Code:

> db.collection_one.find({ $text: { $search: "Mehvish" } } )

OUTPUT:

{ "_id" : ObjectId("62939a37b3a0d806d251ddae"), "name" : "Mehvish Ashiq" }
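When the $search string contains several space-separated words, $text returns documents that contain any of them, so it helps to rank the results by relevance. The sketch below builds the arguments for such a query using the $meta: 'textScore' expression. buildTextSearch is a hypothetical helper, and the commented usage assumes a Node.js driver collection handle named collection.

```javascript
// Hypothetical helper: builds the filter, projection, and sort for a
// relevance-ranked $text query. It only constructs plain objects.
function buildTextSearch(terms) {
  const textScore = { $meta: 'textScore' };
  return {
    filter: { $text: { $search: terms } },
    projection: { name: 1, score: textScore },
    sort: { score: textScore },
  };
}

const query = buildTextSearch('Mehvish Johnson');
console.log(JSON.stringify(query.filter));
// {"$text":{"$search":"Mehvish Johnson"}}

// Assumed usage with the Node.js driver (requires a connected collection):
// const docs = await collection
//   .find(query.filter)
//   .project(query.projection)
//   .sort(query.sort)
//   .toArray();
```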

Use JavaScript’s Fuse.js Library to Perform Fuzzy Search in MongoDB

Example Code (the fuzzysearch.js file code):

const Fuse = require('fuse.js');
const { MongoClient } = require('mongodb');

const url = 'mongodb://localhost:27017/';

async function main() {
  const client = new MongoClient(url);
  try {
    await client.connect();
    const dbo = client.db('FuseFuzzySearch');

    const personObj = [
      {name: 'Mehvish Ashiq'}, {name: 'Jennifer Johnson'},
      {name: 'Natalie Robinson'}, {name: 'John Ferguson'},
      {name: 'Samuel Patterson'}, {name: 'Salvatore Callahan'},
      {name: 'Mikaela Christensen'}
    ];

    // Wait for the insert to finish before searching and closing the connection.
    await dbo.collection('person').insertMany(personObj);

    // keys: fields to search; includeScore: add the match score to each result.
    const options = {includeScore: true, keys: ['name']};

    // The search runs in memory over the personObj array.
    const fuse = new Fuse(personObj, options);
    const result = fuse.search('jahson');
    console.log(result);
  } finally {
    await client.close();
  }
}

main().catch(console.error);

OUTPUT:

[
  {
    item: { name: 'Jennifer Johnson', _id: 6293aa0340aa3b21483d9885 },
    refIndex: 1,
    score: 0.5445835311565898
  },
  {
    item: { name: 'John Ferguson', _id: 6293aa0340aa3b21483d9887 },
    refIndex: 3,
    score: 0.612592665952338
  },
  {
    item: { name: 'Natalie Robinson', _id: 6293aa0340aa3b21483d9886 },
    refIndex: 2,
    score: 0.6968718698752637
  },
  {
    item: { name: 'Samuel Patterson', _id: 6293aa0340aa3b21483d9888 },
    refIndex: 4,
    score: 0.6968718698752637
  }
]

In this code example, we first import the fuse.js library and the MongoDB client. Next, we connect to MongoDB; if the connection fails for any reason, an error is raised.

Otherwise, we select a database named FuseFuzzySearch, which MongoDB creates when the first document is inserted.

Then, we create an array named personObj containing all the documents we want to insert into the person collection. An error is raised if anything goes wrong while inserting the data.

We create a Fuse object, passing it the personObj array and the options (keys and includeScore), and call search() to perform the fuzzy search and get the results shown above. Note that the search runs in memory over the personObj array, not inside MongoDB.

Here, keys specifies the fields on which the search is performed. includeScore is optional, but it is useful because it adds the match score to each result.

A score of 0 indicates a perfect match, while a score of 1 indicates a complete mismatch. All available options are listed in the Fuse.js documentation.
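Since a lower score means a closer match, one simple way to use it is to discard results above a cutoff. The sketch below works on an array shaped like the Fuse.js output shown earlier (the scores are copied from that output and rounded, with _id omitted). The helper name closeMatches and the cutoff 0.65 are arbitrary choices for illustration, not Fuse.js APIs or defaults.

```javascript
// Sample results in the shape returned by fuse.search() with includeScore: true.
const results = [
  { item: { name: 'Jennifer Johnson' }, refIndex: 1, score: 0.5446 },
  { item: { name: 'John Ferguson' }, refIndex: 3, score: 0.6126 },
  { item: { name: 'Natalie Robinson' }, refIndex: 2, score: 0.6969 },
  { item: { name: 'Samuel Patterson' }, refIndex: 4, score: 0.6969 },
];

// Illustrative helper: keep names whose score is at or below maxScore.
function closeMatches(searchResults, maxScore) {
  return searchResults
    .filter((r) => r.score <= maxScore)
    .map((r) => r.item.name);
}

console.log(closeMatches(results, 0.65));
// [ 'Jennifer Johnson', 'John Ferguson' ]
```

A suitable cutoff depends on the data and on how tolerant the search should be, so it is worth tuning against real queries.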

Finally, do not forget to close the connection. There are many other libraries that you can also explore.


Mehvish Ashiq is a former Java Programmer and a Data Science enthusiast who leverages her expertise to help others to learn and grow by creating interesting, useful, and reader-friendly content in Computer Programming, Data Science, and Technology.
