Summary: in this tutorial, you’ll learn how to use MongoDB createIndex()
method to create an index for a field in a collection to speed up queries.
A quick introduction to indexes
Suppose you have a book that contains a list of movies:
To find a movie with the title Pirates of Silicon Valley
, you need to scan every page of the book until you find the match. This is not efficient.
If the book has an index that maps titles with page numbers, you can look up the movie title in the index to find the page number:
Pimpernel' Smith 1 ... Pirates of Silicon Valley 201 ... Twas the Night 300
In this example, the movie with the title Pirates of Silicon Valley
is located on page 201
. Therefore, you can open page 201 to get detailed information about the movie:
In this analogy, the index speeds up the search and makes it more efficient.
The MongoDB index works in a similar way. To speed up a query, you can create an index for a field of a collection.
However, when you insert, update, or delete the documents from the collection, MongoDB needs to update the index accordingly.
In other words, an index improves the speed of document retrieval at the cost of additional write and storage space to maintain the index data structure. Internally, MongoDB uses the B-tree structure to store the index.
Load sample data
We’ll use the movies
collection from the mflix
sample database to demonstrate how the indexes work in MongoDB.
First, download the movies.json
file and place it in a folder on your computer e.g., c:\data\movies.json
Second, import the movies.json
file into the mflix
database using the mongoimport
tool:
mongoimport c:\data\movies.json -d mflix -c movies
Code language: CSS (css)
List indexes of a collection
By default, all collections have an index on the _id
field. To list the indexes of a collection, you use the getIndexes()
method with the following syntax:
db.collection.getIndexes()
Code language: CSS (css)
In this syntax, the collection
is the name of the collection that you want to get the indexes. For example, the following shows the indexes of the movies
collection in the mflix
database:
db.sales.getIndexes()
Code language: CSS (css)
Output:
[ { v: 2, key: { _id: 1 }, name: '_id_' } ]
Code language: CSS (css)
The output shows the index name '_id_'
and index key _id
. The value 1 in the key : { _id : 1 }
indicates the ascending order of the _id
values in the index.
When an index contains one field, it’s called a single field index. However, if an index holds references to multiple fields, it is called a compound index. This tutorial focuses on a single field index.
Explain a query plan
The following query finds the movie with the title Pirates of Silicon Valley
:
db.movies.find({ title: 'Pirates of Silicon Valley })
To find the movie, MongoDB has to scan the movies
collection to find the match.
Before executing a query, the MongoDB query optimizer comes up with one or more query execution plans and selects the most efficient one.
To get the information and execution statistics of query plans, you can use the explain()
method:
db.collection.explain()
Code language: CSS (css)
For example, the following returns the query plans and execution statistics for the query that finds the movies with the title Pirates of Silicon Valley
:
db.movies.find({
title: 'Pirates of Silicon Valley'
}).explain('executionStats')
Code language: JavaScript (javascript)
The explain()
method returns a lot of information. And you should pay attention to the following winningPlan
:
...
winningPlan: {
stage: 'COLLSCAN',
filter: { title: { '$eq': 'Pirates of Silicon Valley' } },
direction: 'forward'
},
...
Code language: JavaScript (javascript)
The winningPlan
returns the information on the plan that the query optimizer came up with. In this example, the query planner comes up with the COLLSCAN
that stands for the collection scan.
Also, the executionStats
shows that the result contains one document and the execution time is 9 milliseconds:
...
executionStats: {
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 9,
totalKeysExamined: 0,
totalDocsExamined: 23539,
...
Code language: JavaScript (javascript)
Create an index for a field in a collection
To create an index for the title
field, you use the createIndex()
method as follows:
db.movies.createIndex({title:1})
Code language: CSS (css)
Output:
title_1
In this example, we pass a document to the createIndex()
method. The { title: 1}
document contains the field and value pair where:
- The field is the index key (
year
). - The value describes the type of index for the
year
field. The value1
for descending index and-1
for ascending index.
The createIndex()
method returns the index name. In this example, it returns the title_1
which is the concatenation of the field and value.
The following query shows the indexes of the movies
collection:
db.movies.getIndexes()
Code language: CSS (css)
Output:
[
{ v: 2, key: { _id: 1 }, name: '_id_' },
{ v: 2, key: { title: 1 }, name: 'title_1' }
]
Code language: JavaScript (javascript)
The output shows two indexes, one is the default index and another is the year_1
index that we have created.
By default, MongoDB names an index by concatenating the indexed keys and each key’s direction in the index ( i.e. 1 or -1) using underscores as a separator. For example, an index created on { title: 1 }
has the name title_1
.
The following returns the query plans and execution statistics for the query that finds the movie with the title Pirates of Silicon Valley
:
db.movies.find({
title: 'Pirates of Silicon Valley'
})
Code language: CSS (css)
This time the query optimizer uses the index scan (IXSCAN
) instead of the collection scan (COLLSCAN
):
...
winningPlan: {
stage: 'FETCH',
inputStage: {
stage: 'IXSCAN',
keyPattern: { title: 1 },
indexName: 'title_1',
isMultiKey: false,
multiKeyPaths: { title: [] },
isUnique: false,
isSparse: false,
isPartial: false,
indexVersion: 2,
direction: 'forward',
indexBounds: {
title: [
'["Pirates of Silicon Valley", "Pirates of Silicon Valley"]'
]
}
}
},
...
Code language: JavaScript (javascript)
Also, the execution time (executionTimeMillis
) was down to almost zero from 2
milliseconds:
...
executionStats: {
executionSuccess: true,
nReturned: 1,
executionTimeMillis: 0,
totalKeysExamined: 1,
totalDocsExamined: 1,
...
Code language: JavaScript (javascript)
Summary
- An index improves the speed of document retrieval at the cost of additional write and storage space to maintain its data structure.
- Use the
createIndex()
method to create an index for a field in a collection. - Use the
getIndexes()
method to list the indexes of a collection. - Use the
explain()
method to get the information and execution statistics of query plans.