Case Insensitive Queries in MongoDB

Tahseen Tauseef Apr 28, 2022
  1. Case Insensitive Queries in MongoDB
  2. Improve Case-Insensitive Regex Queries
  3. Use Regex in the find() Method for Case Insensitive Search in MongoDB
Case Insensitive Queries in MongoDB

In this article, case insensitive queries are discussed in brief detail. Moreover, case insensitive search queries are also explained in detail.

This article discusses the following topics.

  1. Case insensitive queries
  2. Improve case insensitive regex queries
  3. Use regex in the find() method for case insensitive search

Case Insensitive Queries in MongoDB

Case insensitive indexes enable searches that compare strings without regard for the case.

With db.collection.createIndex() you may establish a case-insensitive index by including the collation parameter as an optional parameter.

db.collection.createIndex( { "key" : 1 },
                           { collation: {
                               locale : <locale>,
                               strength : <strength>
                             }
                           } )

Include the following when specifying a collation for a case-sensitive index.

  1. locale - specifies language rules.
  2. strength - used to determine comparison rules. A 1 or 2 value will indicate a case-insensitive collation.

Behavior:

Using a case insensitive index does not affect query results; however, it can improve speed.

To utilize a collation-specified index, query and sort operations must use the same collation as the index. If a collection defines a collation, all queries and indexes that use that collection inherit it unless they specify a different collation.

Create a Case Insensitive Index

Create an index with a collation and specify the strength option to 1 or 2 to utilize a case-insensitive index on the collection with no default collation. To utilize the index-level collation, you must provide the same collation at the query level.

The following example generates a collection with no default collation and adds an index with a case insensitive collation to the type column.

db.createCollection("fruit")

db.fruit.createIndex( { type: 1},
                      { collation: { locale: `en`, strength: 2 } } )

Queries must have the same collation to use the index.

db.fruit.insertMany( [
   { type: "bloctak" },
   { type: "Bloctak" },
   { type: "BLOCTAK" }
] )

db.fruit.find( { type: "bloctak" } )
//not use index, finds one result

db.fruit.find( { type: "bloctak" } ).collation( { locale: `en`, strength: 2 } )
// uses index, and will find three results

db.fruit.find( { type: "bloctak" } ).collation( { locale: `en`, strength: 1 } )
//not uses the index, finds three results

Case Insensitive Indexes on Collections with the Default Collation

If you establish a collection with a default collation, all future indexes inherit that collation unless you provide a different collation. All queries that do not specify a collation inherit the default collation.

The example below generates a names collection with a default collation and then indexes on the first_name field.

db.createCollection("names", { collation: { locale: `en_US`, strength: 2 } } )

db.names.createIndex( { first_name: 1 } ) // inherits the default collation

Insert a small collection of names:

db.names.insertMany( [
   { first_name: "Betsy" },
   { first_name: "BETSY"},
   { first_name: "betsy"}
] )

Queries on this collection, by default, utilize the provided collation and, if possible, the index.

db.names.find( { first_name: "betsy" } )
// inherits the default collation: { collation: { locale: `en_US`, strength: 2 } }
// finds three results

The preceding procedure discovers all three documents using the collection’s default collation. It employs an index on the first_name field for improved efficiency.

This collection may still perform case-sensitive searches by specifying a different collation in the query.

db.names.find( { first_name: "betsy" } ).collation( { locale: `en_US` } )
// not use the collection`s default collation, finds one result

The preceding procedure returns only one document because it utilizes a collation with no strength value supplied. It does not utilize the index or the collection’s default collation.

Improve Case-Insensitive Regex Queries

If you often do case-insensitive regex queries (with the I option), you should establish a case-insensitive index to accommodate your searches.

A collation on an index can be used to provide language-specific string comparison rules, such as letter-case and accent mark rules. A case-insensitive index improves performance for case-insensitive queries significantly.

Consider the following documents in an employees collection. Aside from the usual _id index, this collection contains no other indexes.

db={
  "employees": [
    {
      "_id": 1,
      "first_name": "Hannah",
      "last_name": "Simmons",
      "dept": "Engineering"
    },
    {
      "_id": 2,
      "first_name": "Michael",
      "last_name": "Hughes",
      "dept": "Security"
    },
    {
      "_id": 3,
      "first_name": "Wendy",
      "last_name": "Crawford",
      "dept": "Human Resources"
    },
    {
      "_id": 4,
      "first_name": "MICHAEL",
      "last_name": "FLORES",
      "dept": "Sales"
    }
  ]
}

If your application often searches the first_name column, you might wish to use case-insensitive regex queries to discover matched names.

Case-insensitive regex also aids in matching against data formats that differ, as in the example above, where you have the first_name of both Michael and MICHAEL.

If a user searches for michael, the program may execute the following query.

db.employees.find({
  first_name: {
    $regex: "michael",
    $options: "i"
  }
})

Because this query contains the $regex:

{ "_id" : 2, "first_name" : "Michael", "last_name" : "Hughes", "dept" : "Security" }
{ "_id" : 4, "first_name" : "MICHAEL", "last_name" : "FLORES", "dept" : "Sales" }

Although this query returns the desired documents, case-insensitive regex queries without index support are slow. You may boost efficiency by creating a case-insensitive index on the first_name field.

db.employees.createIndex(
  { first_name: 1 },
  { collation: { locale: 'en', strength: 2 } }
)

When the strength field of an index’s collation document is set to 1 or 2, the index is case-insensitive for a more extensive explanation of the collation document and the various strength values.

For the application to use this index, you must also mention the index’s collation document in the query. Remove the $regex operator from the previous db.collection.find() function and use the newly constructed index instead.

db.employees.find( { first_name: "michael" } ).collation( { locale: 'en', strength: 2 } )

Do not utilize the $regex operator when utilizing a case-insensitive index for your query. The $regex implementation does not support collation and cannot use case-insensitive indexes.

Use Regex in the find() Method for Case Insensitive Search in MongoDB

Use regex in the find() method for case insensitive search.

Syntax:

db.demo572.find({"yourFieldName" : { `$regex`:/^yourValue$/i}});

To understand the above syntax, let’s create a collection of documents.

> db.demo572.insertOne({"CountryName":"US"});{
   "acknowledged" : true, "insertedId" : ObjectId("5e915f0e581e9acd78b427f1")
}
> db.demo572.insertOne({"CountryName":"UK"});{
   "acknowledged" : true, "insertedId" : ObjectId("5e915f17581e9acd78b427f2")
}
> db.demo572.insertOne({"CountryName":"Us"});{
   "acknowledged" : true, "insertedId" : ObjectId("5e915f1b581e9acd78b427f3")
}
> db.demo572.insertOne({"CountryName":"AUS"});{
   "acknowledged" : true, "insertedId" : ObjectId("5e915f20581e9acd78b427f4")
}
> db.demo572.insertOne({"CountryName":"us"});{
   "acknowledged" : true, "insertedId" : ObjectId("5e915f25581e9acd78b427f5")
}

The find() function displays all documents in a collection.

db.demo572.find();

This will produce the following output.

Case Insensitive Search in MongoDB

The following is the query for case insensitive search.

> db.demo572.find({"CountryName" : { `$regex`:/^US$/i}});

This will produce the following output.

Case Insensitive Search in MongoDB 2

Related Article - MongoDB Query