I’m hoping that after reading the previous blog, you will have a better understanding of indexes and the guidelines for building one.
In this blog, I am going to share index types and their characteristics.
Index Types
Index Properties
Index Types
Let’s drive through the index types prior to version 4.2.
Single Field Index
MongoDB supports single field indexes, A single-field index is used to create an index on the single field of a document.
Regardless of the specified index order, the unique characteristics of a single field index allow traversal in both ascending and descending order.
For example, consider we are having an inventory collection with the following documents.
{ _id: 5, type: “food”, item: “aaa”, ratings: [ 6.5, 8, 9 ] }
{ _id: 6, type: “food”, item: “bbb”, ratings: [ 5, 9.2 ] }
{ _id: 7, type: “food”, item: “ccc”, ratings: [ 9, 1.5, 8 ] }
{ _id: 8, type: “food”, item: “ddd”, ratings: [ 9.9, 5.3 ] }
{ _id: 9, type: “food”, item: “eee”, ratings: [4. 5, 9, 5 ] }
Compound Index
A compound index is a single index that includes multiple fields. A compound index works by storing a subset of a collection’s data in a sorted B-Tree data structure.
A compound index can have a maximum of 32 fields indexed.
For example, consider we are having a people collection with the following documents.
{ name:”Ram”,type: “food”, item: “aaa”, ratings: [ 6.5, 8, 9 ] }
{ name:”Jhonson”, type: “food”, item: “bbb”, ratings: [ 5, 9.2 ] }
{ name:”James”, type: “food”, item: “ccc”, ratings: [ 9, 1.5, 8 ] }
{ name:”Priya”, type: “food”, item: “ddd”, ratings: [ 9.9, 5.3 ] }
{ name:”Kalyan”, type: “food”, item: “eee”, ratings: [4. 5, 9, 5 ] }
This compound index creates a sorted B-Tree structure where records are first stored by name in ascending order. The sorted records then store nested item values in descending order.
Sort Order
Indexes store references to fields in either ascending (1) or descending (-1) sort order. For single-field indexes, the sort order of keys doesn’t matter because MongoDB can traverse the index in either direction. However, for compound indexes, sort order can matter in determining whether the index can support a sort operation.
Multikey Index
MongoDB generates an index key for each element of an array to use when indexing a field that contains an array value. These multikey indexes enable effective array field queries.
Multikey indexes can be built over arrays that contain nested documents as well as scalar variables like characters and numbers.
For example, consider we have an inventory collection with the following documents.
{ _id: 5, type: “food”, item: “aaa”, ratings: [ 6.5, 8, 9 ],category:[Indian] }
{ _id: 6, type: “food”, item: “bbb”, ratings: [ 5, 9.2 ] ,category:[Chinese]}
{ _id: 7, type: “food”, item: “ccc”, ratings: [ 9, 1.5, 8 ],category:[Indian,Chinese]}
{ _id: 8, type: “food”, item: “ddd”, ratings: [ 9.9, 5.3 ],category:[Indian] }
{ _id: 9, type: “food”, item: “eee”, ratings: [4. 5, 9, 5 ],category:[Indian]}
Where ratings and category are the array field.
Limitations of MultiKey Index
Compound Multikey Indexes
1. Compound multikey indexes with multiple array fields are not permitted in MongoDB.
Consider the inventory collection.
Where ratings and category are the array field.
You are not permitted to insert a document that will violate the limitations if the compound multikey index already exists.
2. You are not allowed to specify a multikey index as the shard key index.
3. In MongoDB, hashed indexes are not multikey indexes.
4. The multikey index cannot support the $expr operator.
Geospatial Index
MongoDB’s geospatial indexing allows you to efficiently execute spatial queries on a collection that contains geospatial shapes and points. To showcase the capabilities of geospatial features and compare different approaches.
For example, consider we have restaurants collection with the following documents.
{ name:”Ram”, item: “aaa”,location:[-50.456,9.6789] }
{ name:”Jhonson”, item: “bbb”,location:[-79.7761,8.3456] }
{ name:”James”, item: “ccc”,location:[65.6413,10.3421] }
2dsphere
Queries that compute geometries on an earth-like sphere are supported by 2dsphere indexes.To create a 2dsphere index, use the db.collection.createIndex() method and specify the string literal “2dsphere” as the index type
2d
Queries that compute geometries on a two-dimensional plane are supported by 2d indexes. To create a 2d index, use the db.collection.createIndex() method, specifying the location field as the key and the string literal “2d” as the index type.
Limitations of the Geospatial Index
You cannot use a geospatial index as a shard key when sharding a collection.
Text Indexes
Any field whose value is a string or an array of string elements can be included in text indexes. A collection can only have one text search index, although that index can have several fields.
To drop a text index pass the name of the index to the db.collection.dropIndex(“TextIndexname”) method.
For example, consider we have a stores collection with the following documents.
{ name: “Ram”, description:”luggage storage”,phoneno:7103456421, Area:chennai}
{name: “Jhonson”, description:”Absolutely everything is new”,phoneno:9104356782, Area:chennai}
{ name: “James”, description:”Confirmation of the storage”,phoneno:8978316736, Area: Chennai }
We can also create a compound text index on multiple fields.
We can also create compound indexes, including a mix of text and traditional indexes.
Limitations of Text Indexes
- At most one text index is allowed per collection
- With $text query expression, we cannot use hint()
- Together Text Index and Sort cannot give the required results. The sort operations cannot use the ordering in the text index.
Performance consideration with text indexes
- Text index can be very large and can take a long time to create.
- MongoDB recommends having enough memory on your system to keep the text index in memory otherwise there may be significant IO involved during the search.
Hashed Indexes
To maintain the entries with hashes of the values of the indexed field (mostly _id field in all collections), we use Hash Index. These indexes have a more random distribution of values along their range, but only support equality matches and cannot support range-based queries.
For example, consider we have an inventory collection with the following documents.
{ _id: 5, type: “food”, item: “aaa”, ratings: [ 6.5, 8, 9 ] }
{ _id: 6, type: “food”, item: “bbb”, ratings: [ 5, 9.2 ] }
{ _id: 7, type: “food”, item: “ccc”, ratings: [ 9, 1.5, 8 ] }
{ _id: 8, type: “food”, item: “ddd”, ratings: [ 9.9, 5.3 ] }
{ _id: 9, type: “food”, item: “eee”, ratings: [4. 5, 9, 5 ] }
Hashed indexes tend to be smaller than scalar indexes because only a hash of the key is stored instead of the full key. E.g. For a simple test with 100k documents, we added hashed and scalar indexes on a string field – fieldName. As shown below, the hashed index tends to be considerably smaller than the scalar indexes
“indexSizes” : {
“_id_” : 811008, “firstName_1″ : 441548”, “firstName_hashed” : 1490944
}
Limitations of Hashed Index
- Hashed indexes do not support arrays.
- Hashed indexes cannot be compound indexes.
- You cannot add unique constraints on hashed indexes.
Wildcard Indexes
MongoDB 4.2 introduces wildcard indexes for supporting queries against unknown or arbitrary fields.
A wildcard index is a type of filter that automatically matches any field, sub-document or array in a collection and indexes those matches.
For example, suppose we have a data collection with the following documents.
{ _id: 5, Data:{A:2145,B:740,C:2013}}
{ _id: 5, Data:{A:1405,B:6740,C:213}}
{ _id: 5, Data:{A:1245,B:7140,C:1213}}
{ _id: 5, Data:{A:1465,B:7030,C:21673}}
Queries might be issued against any one of the attributes in the data sub-document. Furthermore, the application may add new fields that we can’t estimate. We need to create a separate index for each field to optimise performance.
db.data.createIndex({_id:1,”Data.A”:1},{background:true})
db.data.createIndex({_id:1,”Data.B”:1},{background:true})
db.data.createIndex({_id:1,”Data.C”:1},{background:true})
It requires creating too many indexes but even this won’t work unless we know for sure if no new fields are added under Data.
In this scenario, the wildcard indexes come to improve the performance.
The following operations create a wildcard index on the Data field.
The statement creates an index on every field in the Data sub-document. Even if new fields are created by an application after the index is created.
Create a Wildcard Index on All Fields
To index the value of all fields in a document (excluding _id
), specify "$**"
as the index key.
With this wildcard index, MongoDB indexes all fields for each document in the collection. If a given field is a nested document or array, the wildcard index recurses into the document/array and stores the value for all fields in the document/array.
Limitations of Wildcard Indexes
- MongoDB cannot use a non-wildcard index to satisfy one part of a query predicate and a wildcard index to satisfy another.
- MongoDB cannot use one wildcard index to satisfy one part of a query predicate and another wildcard index to satisfy another.
- Even if a single wildcard index could support multiple query fields, MongoDB can use the wildcard index to support only one of the query fields. All remaining fields are resolved without an index.
- Wildcard indexes don’t support compound, TTL, Text, Geospatial, Hashed and Unique indexes.
Hidden Indexes
MongoDB 4.4 introduces Hidden indexes.
A hidden index is simply a regular index that is not visible to the query planner. When evaluating the execution plans, MongoDB ignores such kinds of indexes.
Building an index on MongoDB is quite expensive, in particular for large collections or when you don’t have enough available memory. Disabling indexes are useful for testing different execution plans without dropping any index for real. You can hide or unhide an index at any time at no cost to the database.
To hide index feature compatibility version set for 4.4.
For example, consider we have an inventory collection with the following documents.
{ _id: 5, type: “food”, item: “aaa”, ratings: [ 6.5, 8, 9 ] }
{ _id: 6, type: “food”, item: “bbb”, ratings: [ 5, 9.2 ] }
{ _id: 7, type: “food”, item: “ccc”, ratings: [ 9, 1.5, 8 ] }
{ _id: 8, type: “food”, item: “ddd”, ratings: [ 9.9, 5.3 ] }
{ _id: 9, type: “food”, item: “eee”, ratings: [4. 5, 9, 5 ] }
To create the hidden Index:
To Hide an Existing Index:
To unhide the hidden index:
Limitation of Hidden Index
- You cannot hide the _id index.
- During the writes, MongoDB maintains the hidden indexes the same as any regular index.
- The hidden indexes are immediately available after unhidden.
- A unique index provides uniqueness constraint even when hidden.
- A TLL index drops documents even when hidden.
Index Properties
Unique Index
A unique index ensures that the indexed fields do not store duplicate values i.e. enforces uniqueness for the indexed fields. By default, MongoDB creates a unique index on the _id field during the creation of a collection.
Often, you want to ensure that the values of a field are unique across documents in a collection, such as a code or a username.
Use the createIndex() method with the option { unique: true } to create a unique index and compound a unique index.
Unique Partial Indexes
- Partial indexes index only documents in a collection that matches a specific filter expression. If you specify both a partial filter expression and a unique constraint, the unique constraint applies only to documents that match the filter expression.
- Unique Partial indexes introduce in version 3.2.
Limitation of Unique Index
- MongoDB cannot create a unique index on the specified index fields if the collection already contains data that would violate the unique constraint for the index.
- You may not specify a unique constraint on a hashed index.
Partial Index
Partial indexes only index the documents in a collection that meet a specified filter expression. By indexing a subset of the documents in a collection, partial indexes have lower storage requirements and reduced performance costs for index creation and maintenance.
The partialFilterExpression option accepts a document that specifies the filter condition using:
- equality expressions (i.e. field: value or using the $eq operator),
- $exists: true expression,
- $gt, $gte, $lt, $lte expressions,
- $type expressions,
- $and operator at the top-level only
Example:
Limitations of Partial Index
In 5.0 earlier versions of MongoDB, creating multiple partial indexes is not allowed when using the same key pattern with different partialFilterExpressions.
You cannot specify both the partialFilterExpression option and the sparse option.
_id indexes cannot be partial indexes.
Shard key indexes cannot be partial indexes.
Sparse Indexes
The sparse property of an index ensures that the index only contains entries for documents that have the indexed field. The index skips documents that do not have the indexed field.
Example:
Sparse Index and Incomplete Results
If a sparse index would result in an incomplete result set for queries and sort operations, MongoDB will not use that index unless a hint() explicitly specifies the index.
For example, the query { x: { $exists: false } } will not use a sparse index on the x field unless explicitly hinted. See Sparse Index On A Collection Cannot Return Complete Results for an example that details the behaviour.
Changed in version 3.4.
If you include a hint() that specifies a sparse index when you perform a count() of all documents in a collection (i.e. with an empty query predicate), the sparse index is used even if the sparse index results in an incorrect count.
db.collection.insertOne( { _id: 1, y: 1 } )
db.collection.createIndex( { x: 1 }, { sparse: true } )
db.collection.find().hint( { x: 1 } ).count()
To obtain the correct count, do not hint() with a sparse index when performing a count of all documents in a collection.
db.collection.find().count()
db.collection.createIndex( { y: 1 } )
db.collection.find().hint( { y: 1 } ).count()
Comparison Of Partial Indexes with Sparse Indexes
Partial indexes should be preferred over sparse indexes. Partial indexes provide the following benefits:
- Greater control over which documents are indexed.
- A superset of the functionality offered by sparse indexes.
Sparse indexes select documents to index only based on the existence of the indexed field, or for compound indexes, the existence of the indexed fields.
The partial index can also specify filter expressions on fields other than the index key. For example, the following operation creates a partial index, where the index is on the “name” field but the filter expression is on the “email” field.
You are strongly encouraged to consider partial indexes if you have one or more of these use cases.
Hybrid Index Build
- As of MongoDB 4.2, there are no longer foreground/background indexes, just one hybrid index function.
- This is the only index build type in MongoDB.
- We can now build the index quickly and without the need to lock the database.
Comparison to Foreground and Background Builds
- Starting in MongoDB 4.2, index builds obtain an exclusive lock on only the collection being indexed during the start and end of the build process to protect metadata changes.
- The rest of the build process uses the yielding behaviour of background index builds to maximize read-write access to the collection during the build.
- 4.2 index builds still produce efficient index data structures despite the more permissive locking behaviour.
We hope that after reading these blogs you have an idea about the types of indexes and their properties. In upcoming blogs, we will come up with a terminate-in-progress index build. Happy Learning!!!
Boost Your MongoDB Performance! Explore Mydbops' open-source database management solutions and services. We provide expert guidance and tools to streamline your MongoDB operations. Visit our website or contact us today!
{{cta}}