Index
Types

FILE  24_index_types
TOPIC  Single Field · Multikey · Text · Hashed · TTL · Type Reference
LEVEL  Foundation
01
Single Field Index
Foundational index — one field, any direction
basic

A single field index is the most common index type. It creates a B-Tree on one field of a document (top-level or nested). Direction (1 ascending, -1 descending) is functionally irrelevant for single field indexes — MongoDB traverses the B-Tree in either direction equally.

// Ascending index on email (direction irrelevant for single field)
db.users.createIndex({ email: 1 }, { name: "idx_email" })

// Index on a nested embedded field
db.users.createIndex({ "address.city": 1 }, { name: "idx_city" })

// Unique email constraint
db.users.createIndex({ email: 1 }, { unique: true, name: "idx_email_unique" })

What Single Field Indexes Provide

  • Equality — { email: "x@y.com" } → IXSCAN
  • Range — { age: { $gte: 18, $lt: 65 } } → IXSCAN
  • Sort elimination — sorting by the indexed field requires no in-memory sort
  • Covered queries — if the projection is limited to indexed fields, the query is answered from the index alone
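
To make the covered-query bullet concrete, here is a minimal sketch assuming the idx_email index from above (the query values are hypothetical):

// Covered query: filter and projection both touch only the indexed field,
// so MongoDB answers from the index alone — explain() shows no FETCH stage
db.users.find(
  { email: "x@y.com" },
  { email: 1, _id: 0 }   // _id must be excluded: it is not in this index
)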
NOTE
MongoDB automatically creates a single field index on _id for every collection. It is unique, ascending, and cannot be dropped or modified.
02
Multikey Index (Array Fields)
One index entry per array element — automatic detection
arrays

When you create an index on a field that contains an array, MongoDB automatically creates a multikey index — storing one separate index entry per array element. This enables efficient queries that search within array fields.

// MongoDB detects the array and builds multikey automatically
db.products.createIndex({ tags: 1 })
// Document: { _id: 1, tags: ["mongodb", "nosql", "database"] }
// Generates 3 index entries: "mongodb"→doc, "nosql"→doc, "database"→doc

// Index on array of embedded documents
db.orders.createIndex({ "items.productId": 1 })

// Query uses the multikey index just like a regular index
db.products.find({ tags: "mongodb" })   // IXSCAN on multikey
db.products.find({ tags: { $in: ["mongodb", "nosql"] } })  // also IXSCAN

Multikey Constraints

DANGER
A compound index cannot contain more than one array field. With both tags and categories holding arrays, createIndex({ tags: 1, categories: 1 }) triggers "cannot index parallel arrays" — at index build time if such a document already exists, or on the first insert of one otherwise. Create separate single-field indexes for each array field instead.
  • Max array fields in compound index — 1; two array fields = error
  • Covered queries — not supported; a document fetch is always required
  • Shard key — a multikey index cannot be used as a shard key
  • Write overhead — each array element = one B-Tree entry; large arrays = many writes
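
A minimal sketch of the separate-indexes workaround for the parallel-arrays restriction, assuming hypothetical products documents where both tags and categories are arrays:

// Index each array field separately instead of one compound index
db.products.createIndex({ tags: 1 })
db.products.createIndex({ categories: 1 })

// A query on both fields still gets an IXSCAN on one multikey index;
// the remaining predicate is applied as a filter afterwards
db.products.find({ tags: "mongodb", categories: "databases" })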

Array Size and Index Performance

// A document with a 1000-element array generates 1000 index entries
// Average array size directly impacts index size and write latency
// Monitor: db.collection.stats().indexSizes

// For very large arrays consider $elemMatch queries and partial indexes
// instead of indexing the entire array field
03
Text Index
Full-text keyword search with stemming and relevance scoring
search

A text index tokenizes string fields into indexed terms, applies language-specific stemming, and filters stop words. It enables full-text keyword and phrase search via the $text operator.

// Create a text index on multiple fields with weights
db.articles.createIndex(
  { title: "text", body: "text", tags: "text" },
  {
    weights:          { title: 10, tags: 5, body: 1 },
    default_language: "english",
    name:             "idx_article_text"
  }
)
// title matches are weighted 10x over body matches in relevance scoring

Querying a Text Index

// Keyword search — "mongodb" OR "indexing" (OR logic between words)
db.articles.find({ $text: { $search: "mongodb indexing" } })

// Exact phrase (double-quoted within the string)
db.articles.find({ $text: { $search: "\"aggregation pipeline\"" } })

// Exclude a term
db.articles.find({ $text: { $search: "mongodb -nosql" } })

// Sort by relevance score
db.articles.find(
  { $text: { $search: "indexing" } },
  { score: { $meta: "textScore" }, title: 1, _id: 0 }
).sort({ score: { $meta: "textScore" } })

Text Search Behavior

  • Stemming — "running" matches "run", "runs", "runner"
  • Case sensitivity — case-insensitive by default
  • Diacritics — "café" matches "cafe"
  • Stop words — "the", "is", "and" ignored automatically
  • Word logic — space-separated words are OR'd; wrap a phrase in quotes to require the exact sequence
WARN
Only one text index per collection is allowed, so all text-searchable fields must be combined into that single index — creating a second text index on the same collection fails. Also: in an aggregation pipeline, $text is only valid inside a $match that is the very first stage.

Text Index Limitations

  • Cannot use range queries or standard sort operations
  • Significantly larger storage footprint than B-Tree indexes
  • Text indexes slow down writes more than normal indexes (many term entries per document)
  • For production search at scale, consider Atlas Search (Lucene-based) instead
04
Hashed Index
Even shard distribution — no range queries
sharding

A hashed index stores the hash of a field's value rather than the value itself. Its primary use is as a shard key to ensure even data distribution across shards, preventing hotspots caused by monotonically increasing fields (ObjectIds, timestamps).

// Create a hashed index
db.users.createIndex({ userId: "hashed" }, { name: "idx_userid_hashed" })

// Use as shard key for even distribution
sh.shardCollection("mydb.users", { userId: "hashed" })
// Documents are distributed based on hash(userId), not userId value
// Result: even spread across shards even for sequential IDs
DANGER
Hashed indexes cannot be used for range queries ($gt, $lt, $gte, $lte) or sort operations because hashing destroys sorted order. A range query on a hashed-indexed field falls back to a COLLSCAN. Only use hashed indexes for shard keys on equality-accessed fields.

Hashed Index Edge Cases

  • Floating-point values — truncated to a 64-bit integer before hashing, so 2.3 and 2.9 hash identically
  • Range query on hashed field — index not used; falls back to COLLSCAN
  • Sort on hashed field — index not used; in-memory sort required
  • Compound with another hashed field — not supported (at most one hashed field per index)

Hashed vs Range Shard Key

// Range shard key — sequential inserts all go to the same shard (hotspot)
sh.shardCollection("db.events", { timestamp: 1 })
// ↑ New documents always land on the "last" shard — uneven load

// Hashed shard key — sequential inserts distributed evenly
sh.shardCollection("db.events", { timestamp: "hashed" })
// ↑ Inserts are spread across shards by hash(timestamp) — balanced writes
05
TTL Index
Auto-delete documents after a time period — up to 60s lag
expiry

A TTL (Time To Live) index is a single-field index on a BSON Date field that instructs MongoDB to automatically delete documents after a specified number of seconds has elapsed past the stored date value.

// Delete sessions 1 hour after createdAt
db.sessions.createIndex(
  { createdAt: 1 },
  { expireAfterSeconds: 3600, name: "idx_session_ttl" }
)

// Delete OTP tokens 5 minutes after generation
db.otpTokens.createIndex(
  { generatedAt: 1 },
  { expireAfterSeconds: 300, name: "idx_otp_ttl" }
)

// Delete at an exact stored datetime
// expireAfterSeconds: 0 → delete when current time reaches the stored date
db.jobs.createIndex(
  { expiresAt: 1 },
  { expireAfterSeconds: 0, name: "idx_job_expiry" }
)
// { expiresAt: ISODate("2024-12-31T23:59:59Z") }
// Document is deleted at/after that exact timestamp

How TTL Deletion Works

A background thread called TTLMonitor runs approximately every 60 seconds and removes documents where the date field value + expireAfterSeconds <= current time.

WARN
TTL deletion is not instantaneous. There is a lag of up to 60+ seconds between a document becoming eligible for expiry and its actual removal. Do not rely on TTL for time-sensitive enforcement in application logic — check expiry in your code as well.
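
One practical note: the expiry window can be changed in place with the collMod command instead of dropping and rebuilding the index. A sketch, assuming the sessions TTL index created above:

// Extend the session TTL from 1 hour to 2 hours without an index rebuild
db.runCommand({
  collMod: "sessions",
  index: {
    keyPattern: { createdAt: 1 },
    expireAfterSeconds: 7200
  }
})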

TTL Edge Cases

  • Field contains an array of dates — document expires based on the earliest date in the array
  • Field missing from document — document never expires
  • Field is not a BSON Date (e.g., a string timestamp) — document never expires
  • Replica set secondary — TTL deletion runs only on the primary; secondaries replicate the deletes
  • Capped collection — TTL indexes are not supported
  • Heavy write load — the TTLMonitor can fall behind; lag may exceed 60 seconds
06
Wildcard Index
Index all fields in dynamic/unknown schemas
dynamic

A wildcard index indexes every field in a document (or every field under a sub-path). It is designed for collections with highly variable or unknown schemas where queries filter on arbitrary fields.

// Index ALL fields in every document
db.collection.createIndex({ "$**": 1 })

// Index all fields under a specific sub-document path
db.products.createIndex({ "metadata.$**": 1 })

// Queries on any field under metadata now use the index:
db.products.find({ "metadata.color": "red" })
db.products.find({ "metadata.size.width": { $gt: 10 } })
WARN
Wildcard indexes are large and expensive to maintain — every field in every document generates an index entry. Use them only when queries are truly unpredictable; for known query patterns, a targeted compound index is almost always more efficient.

Wildcard Index Constraints

  • Cannot be used as a shard key
  • Cannot be compound (only one field spec)
  • Cannot support covered queries
  • Significantly larger than targeted indexes
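
One way to tame the size problem is to scope the index with the wildcardProjection option, which limits indexing to selected paths. A sketch — the metadata and specs field names are hypothetical:

// Index only the metadata and specs subtrees, not every field
db.products.createIndex(
  { "$**": 1 },
  { wildcardProjection: { metadata: 1, specs: 1 } }
)
// Note: wildcardProjection is only valid with a { "$**": 1 } key pattern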
07
Index Type Reference
All types at a glance
reference
  • Single Field — basic filtering and sorting · Range queries: Yes · Direction irrelevant for a single field
  • Compound — multi-field filter + sort · Range queries: Yes (prefix) · Field order is critical; max 32 fields
  • Multikey — indexing array fields · Range queries: Yes · Only one array field per compound index
  • Text — full-text keyword search · Range queries: No · One per collection; stemming + stop words
  • Hashed — shard key for even distribution · Range queries: No · No range/sort; floats truncated before hashing
  • TTL — auto-expiry of documents · Range queries: Yes (a normal B-Tree) · Field must be a BSON Date; ~60s deletion lag
  • Sparse — optional fields · Range queries: Yes · Docs missing the field are excluded; { field: null } queries can't use it
  • Partial — index a document subset · Range queries: Yes · Query must satisfy the partialFilterExpression
  • Wildcard — dynamic/unknown schemas · Range queries: Yes · Large; no shard key; no covered queries