26 · Index Properties

01

Unique Index

Enforce uniqueness — E11000 on duplicate insert

unique

A unique index rejects any write that would create a duplicate value for the indexed field(s). MongoDB enforces this constraint at the storage layer — no application-level duplicate check is needed.

// Single field unique index
db.users.createIndex({ email: 1 }, { unique: true, name: "idx_email_unique" })

// Compound unique index — the COMBINATION must be unique
// (userId, date) can repeat userId, repeat date — but not the same (userId, date) pair
db.checkins.createIndex(
  { userId: 1, date: 1 },
  { unique: true, name: "idx_user_date_unique" }
)

// Duplicate insert → WriteError E11000
// E11000 duplicate key error collection: mydb.users index: idx_email_unique
// dup key: { : "user@example.com" }

Null and Missing Fields with Unique Indexes

DANGER

A standard unique index treats null as a value. If the indexed field is missing or null in multiple documents, they all have null as their index key — which violates uniqueness. You can only have one document with a missing/null value in a standard unique index. Use unique + sparse or unique + partial to allow multiple documents with missing values.

// Standard unique index on email — only ONE null allowed
db.users.createIndex({ email: 1 }, { unique: true })

db.users.insertOne({ name: "Alice" })             // ✅ no email — null
db.users.insertOne({ name: "Bob" })               // ❌ E11000 — second null

// Fix: unique + sparse — excludes missing/null docs from index entirely
db.users.createIndex({ email: 1 }, { unique: true, sparse: true })
db.users.insertOne({ name: "Alice" })             // ✅ excluded from index
db.users.insertOne({ name: "Bob" })               // ✅ excluded from index
db.users.insertOne({ name: "Carol", email: "c@x.com" }) // ✅ in index, unique
db.users.insertOne({ name: "Dave",  email: "c@x.com" }) // ❌ E11000

Unique Index Edge Cases

Scenario	Behavior
Inserting duplicate value	WriteError E11000
Creating unique index on existing collection with duplicates	Index creation fails — must resolve duplicates first
Two documents with missing field (standard unique)	Second insert fails — both have null key
Unique + sparse with missing field	Multiple missing-field docs allowed — excluded from index
Case sensitivity	"User@example.com" ≠ "user@example.com" — unique treats them as distinct
Sharded collection unique index	Unique only if shard key is a prefix of the unique index key

02

Sparse Index

Index only documents where the field exists — excludes null/missing

sparse

A sparse index only includes documents where the indexed field exists and is not null. Documents that are missing the field are excluded from the index entirely. This reduces index size when the field is optional and present in only a subset of documents.

// Standard index — all documents indexed (missing field stored as null)
db.users.createIndex({ linkedInUrl: 1 })
// Every user document indexed, even those without linkedInUrl

// Sparse index — only users WITH linkedInUrl are indexed
db.users.createIndex({ linkedInUrl: 1 }, { sparse: true, name: "idx_linkedin_sparse" })
// Documents without linkedInUrl are NOT in the index — smaller index size

The Critical Sparse Index Gotcha

DANGER

Querying for null or missing values on a sparse-indexed field falls back to COLLSCAN. Sparse indexes exclude documents with null/missing fields, so they cannot satisfy queries that specifically look for null or missing. This is the most common sparse index pitfall.

// Sparse index on linkedInUrl
db.users.createIndex({ linkedInUrl: 1 }, { sparse: true })

// ✅ Uses sparse index — finds existing linkedInUrl values
db.users.find({ linkedInUrl: "https://linkedin.com/in/alice" })

// ❌ COLLSCAN — querying for null/missing falls back (index excludes these docs)
db.users.find({ linkedInUrl: null })
db.users.find({ linkedInUrl: { $exists: false } })

// ❌ COLLSCAN — sort using sparse index when collection has docs without the field
// MongoDB cannot use sparse index for sort if it would exclude docs from results
db.users.find({}).sort({ linkedInUrl: 1 })

Standard vs Sparse Index Behavior

Scenario	Standard Index	Sparse Index
Document missing indexed field	Indexed as null	Excluded from index
Query `{ field: null }`	IXSCAN	COLLSCAN (excluded)
Query `{ field: { $exists: false } }`	IXSCAN	COLLSCAN
Query `{ field: "value" }`	IXSCAN	IXSCAN
Sort on field (all docs)	Index can serve sort	COLLSCAN (excluded docs)
Index size	Full collection size	Only docs with field

TIP

Use sparse indexes for optional fields that are present in less than ~20% of documents. A good example: an ssoToken field that only exists for enterprise accounts. The index stays small and only serves queries for SSO users — the common case query for non-SSO users won't be affected because they don't have the field at all.

03

Partial Index

Index only documents matching a filter expression — precise subset indexing

partial

A partial index uses a partialFilterExpression to index only the subset of documents that satisfy the filter. Unlike sparse (which only filters on field existence), partial can filter using any supported operator — enabling targeted indexes for high-frequency query patterns.

// Only index active users — queries on inactive users fall back to COLLSCAN
db.users.createIndex(
  { email: 1 },
  {
    partialFilterExpression: { isActive: true },
    name: "idx_email_active_partial"
  }
)

// Only index completed orders greater than $100
db.orders.createIndex(
  { customerId: 1, createdAt: -1 },
  {
    partialFilterExpression: {
      status:  "completed",
      amount:  { $gt: 100 }
    },
    name: "idx_customer_created_highvalue"
  }
)

Superset Requirement — Queries Must Include the Filter

DANGER

A partial index is only used if the query filter is a superset of the partialFilterExpression. The query must explicitly include the same conditions (or stricter ones). If the query omits the partial index conditions, MongoDB falls back to COLLSCAN — the index cannot be safely used because it excludes documents the query might need.

// Index with partialFilterExpression: { isActive: true }

// ✅ IXSCAN — query includes isActive: true (superset of filter)
db.users.find({ email: "a@b.com", isActive: true })

// ❌ COLLSCAN — query doesn't specify isActive (might need inactive users too)
db.users.find({ email: "a@b.com" })

// ❌ COLLSCAN — isActive: false is not a superset of isActive: true
db.users.find({ email: "a@b.com", isActive: false })

// ✅ IXSCAN — stricter condition satisfies superset requirement
// (any { isActive: true, role: "admin" } doc is also { isActive: true })
db.users.find({ email: "a@b.com", isActive: true, role: "admin" })

Supported Operators in partialFilterExpression

Operator	Supported	Example
`$eq` (and direct equality)	Yes	`{ status: "active" }`
`$gt`, `$gte`, `$lt`, `$lte`	Yes	`{ amount: { $gt: 100 } }`
`$type`	Yes	`{ field: { $type: "string" } }`
`$exists`	Yes	`{ field: { $exists: true } }`
`$and`	Yes	Multiple conditions
`$or`, `$in`, `$ne`	No	Not supported in partialFilterExpression
`$where`, `$expr`	No	Not supported

TIP

Partial indexes are ideal for status-based access patterns. If 95% of queries target "active" records and only 5% ever look at "archived" records, create a partial index covering active records only. The index is tiny, fast, and always in cache — archived queries can tolerate the COLLSCAN since they're rare.

04

Hidden Index

Maintained but invisible to query planner — safe removal workflow

hidden

A hidden index is fully maintained on writes (B-Tree updated on every insert/update/delete) but the query planner ignores it completely — as if it doesn't exist. This allows you to safely test the impact of removing an index before actually dropping it.

// Create an index that is hidden from the start
db.collection.createIndex(
  { field: 1 },
  { hidden: true, name: "idx_field_hidden" }
)

// Hide an existing index
db.collection.hideIndex("idx_field_name")

// Unhide — makes index visible to query planner again (instantly)
db.collection.unhideIndex("idx_field_name")

Safe Index Removal Workflow

Never drop an index in production without running the hidden index workflow first. Dropping an index is instant but irreversible — recreating a large index can take hours and heavily impacts write performance.

// Step 1: Check current index usage with $indexStats
db.orders.aggregate([{ $indexStats: {} }])
// Look at: name, ops (access count since last restart), since (when tracking started)
// Candidates for removal: ops: 0 or very low for the tracking duration

// Step 2: Hide the index
db.orders.hideIndex("idx_status_created")
// Queries that relied on this index now fall back to other indexes or COLLSCAN
// The index is still maintained on writes — no data loss, instant recovery

// Step 3: Monitor for 24–48 hours (cover at least one business cycle)
// Watch: query response times, slow query logs, application error rates
// MongoDB Atlas has Performance Advisor for automated slow query detection

// Step 4a: If no degradation → index was not used → safe to drop
db.orders.dropIndex("idx_status_created")

// Step 4b: If queries slowed → index WAS needed → unhide immediately
db.orders.unhideIndex("idx_status_created")
// Recovery is instant — no rebuild required

TIP

The hidden index workflow is the safest approach to index management in production. Since unhiding is instantaneous, there is zero risk compared to dropping and rebuilding. Always hide before dropping, especially for indexes on large collections where recreation would be expensive.

Hidden Index Properties

Property	Behavior
Write overhead	Full — B-Tree updated on every write (same as visible index)
Query planner	Completely ignored — no IXSCAN possible
Index size in RAM	Still occupies WiredTiger cache
Hint override	Can force hidden index via `.hint({ field: 1 })` — bypasses hidden status
Visibility in getIndexes()	Shown with `hidden: true` property
Re-enabling time	Instant — no rebuild

05

Collation Index

Locale-aware string comparison for language-specific queries

collation

A collation index defines locale-specific string comparison rules for the index — controlling case sensitivity, accent sensitivity, and character ordering. It enables queries that treat "cafe" and "café" as equal, or "A" and "a" as equal.

// Create a case-insensitive index using English collation
db.users.createIndex(
  { name: 1 },
  {
    collation: { locale: "en", strength: 2 },
    name: "idx_name_ci"
  }
)
// strength: 1 = case and accent insensitive
// strength: 2 = case insensitive, accent sensitive (most common)
// strength: 3 = case sensitive, accent sensitive (default binary)

// Query MUST specify the same collation to use the index
db.users.find(
  { name: "alice" }
).collation({ locale: "en", strength: 2 })
// Matches: "alice", "Alice", "ALICE" — all treated as equal

Collation Query Requirement

WARN

A query only uses a collation index if it specifies the same collation as the index. A query without .collation() will not use the collation index (even for simple equality on the indexed field) and falls back to the default binary comparison or COLLSCAN. Always match collations explicitly.

// Index collation: { locale: "en", strength: 2 }

// ✅ Uses index — collation matches
db.users.find({ name: "alice" }).collation({ locale: "en", strength: 2 })

// ❌ Does NOT use collation index — no collation specified on query
db.users.find({ name: "alice" })
// Falls back to default binary comparison — different results for "Alice" vs "alice"

// Setting default collection collation — all queries use it automatically
db.createCollection("users", {
  collation: { locale: "en", strength: 2 }
})
// Now all queries on this collection use this collation by default

Common Collation Strength Settings

Strength	Case Sensitive	Accent Sensitive	Use Case
1	No	No	"cafe" = "café" = "Café" — broad text matching
2	No	Yes	"alice" = "Alice"; "cafe" ≠ "café" — common CI search
3	Yes	Yes	Default binary-like behavior with locale ordering

06

Combining Properties

unique + sparse, partial + unique, and other combinations

combos

Index properties can be combined to create highly targeted constraints. The most useful combinations solve real-world problems that neither property alone can address.

unique + sparse: Optional Unique Fields

// Allow multiple users without a phone number,
// but enforce uniqueness among those who DO have one
db.users.createIndex(
  { phone: 1 },
  { unique: true, sparse: true, name: "idx_phone_unique_sparse" }
)

db.users.insertOne({ name: "Alice" })                         // ✅ no phone — excluded
db.users.insertOne({ name: "Bob" })                           // ✅ no phone — excluded
db.users.insertOne({ name: "Carol", phone: "+1234567890" })   // ✅ unique phone
db.users.insertOne({ name: "Dave",  phone: "+1234567890" })   // ❌ E11000

partial + unique: Conditional Uniqueness

// Enforce unique email only for active users
// Inactive (soft-deleted) users can share email values with active users
db.users.createIndex(
  { email: 1 },
  {
    unique: true,
    partialFilterExpression: { isActive: true },
    name: "idx_email_unique_active"
  }
)

db.users.insertOne({ email: "a@b.com", isActive: true  })   // ✅ indexed, unique
db.users.insertOne({ email: "a@b.com", isActive: false })   // ✅ not indexed (partial)
db.users.insertOne({ email: "a@b.com", isActive: true  })   // ❌ E11000 (duplicate active)

partial + sparse: Redundancy Note

// partialFilterExpression with { $exists: true } is functionally equivalent to sparse
db.col.createIndex({ field: 1 }, { sparse: true })

// Equivalent partial index:
db.col.createIndex({ field: 1 }, {
  partialFilterExpression: { field: { $exists: true } }
})

// However, partial is more flexible — can add more filter conditions than just existence
// Prefer partial over sparse for new code when possible

NOTE

When choosing between sparse and partial: partial is strictly more powerful than sparse. Sparse only filters on field existence. Partial can filter on any supported operator combination. Use sparse only if you are on an older MongoDB version (< 3.2) where partial isn't available, or for simplicity when existence-only filtering is sufficient.

07

Property Comparison

Choosing the right index property for your use case

reference

Property	Purpose	Key Constraint	Null/Missing Behavior
Standard (default)	General filtering and sorting	None	Indexed as null
`unique`	Enforce value uniqueness	Only one null per index	Null treated as value (only one allowed)
`sparse`	Optional field — reduce index size	Cannot query for null/missing via index	Excluded from index entirely
`partial`	Index a query-relevant subset	Query must be superset of partialFilterExpression	Excluded if doesn't match filter
`hidden`	Safe removal testing	Full write overhead still applies	Standard behavior when visible
`collation`	Locale-aware string comparison	Query must specify matching collation	Standard null handling

Decision Guide

Requirement	Use
Enforce uniqueness, field always present	`unique`
Enforce uniqueness, field optional	`unique + sparse` or `unique + partial`
Field present on only a few documents	`sparse` or `partial ({ $exists: true })`
Only certain documents are queried (e.g., active records)	`partial`
Case-insensitive search	`collation (strength: 2)`
Test removing an index without data loss risk	`hidden`
Conditional uniqueness (unique only within a subset)	`unique + partial`

WARN

Sparse and partial indexes both reduce index size and write overhead — but they also reduce query flexibility. Any query that doesn't satisfy the exclusion criteria (null for sparse, partialFilterExpression for partial) cannot use the index and falls back to COLLSCAN. Always verify all your query patterns before enabling sparse or partial properties.