← back

Index
Properties

FILE  26_index_properties
TOPIC  Unique · Sparse · Partial · Hidden · Collation · Property Comparison
LEVEL  Intermediate
01
Unique Index
Enforce uniqueness — E11000 on duplicate insert
unique

A unique index rejects any write that would create a duplicate value for the indexed field(s). MongoDB enforces this constraint at the storage layer — no application-level duplicate check is needed.

// Single field unique index
db.users.createIndex({ email: 1 }, { unique: true, name: "idx_email_unique" })

// Compound unique index — the COMBINATION must be unique
// (userId, date) can repeat userId, repeat date — but not the same (userId, date) pair
db.checkins.createIndex(
  { userId: 1, date: 1 },
  { unique: true, name: "idx_user_date_unique" }
)

// Duplicate insert → WriteError E11000
// E11000 duplicate key error collection: mydb.users index: idx_email_unique
// dup key: { : "user@example.com" }

Null and Missing Fields with Unique Indexes

DANGER
A standard unique index treats null as a value. If the indexed field is missing or null in multiple documents, they all have null as their index key — which violates uniqueness. You can only have one document with a missing/null value in a standard unique index. Use unique + sparse or unique + partial to allow multiple documents with missing values.
// Standard unique index on email — only ONE null allowed
db.users.createIndex({ email: 1 }, { unique: true })

db.users.insertOne({ name: "Alice" })             // ✅ no email — null
db.users.insertOne({ name: "Bob" })               // ❌ E11000 — second null

// Fix: unique + sparse — excludes missing/null docs from index entirely
db.users.createIndex({ email: 1 }, { unique: true, sparse: true })
db.users.insertOne({ name: "Alice" })             // ✅ excluded from index
db.users.insertOne({ name: "Bob" })               // ✅ excluded from index
db.users.insertOne({ name: "Carol", email: "c@x.com" }) // ✅ in index, unique
db.users.insertOne({ name: "Dave",  email: "c@x.com" }) // ❌ E11000

Unique Index Edge Cases

ScenarioBehavior
Inserting duplicate valueWriteError E11000
Creating unique index on existing collection with duplicatesIndex creation fails — must resolve duplicates first
Two documents with missing field (standard unique)Second insert fails — both have null key
Unique + sparse with missing fieldMultiple missing-field docs allowed — excluded from index
Case sensitivity"User@example.com" ≠ "user@example.com" — unique treats them as distinct
Sharded collection unique indexUnique only if shard key is a prefix of the unique index key
02
Sparse Index
Index only documents where the field exists — excludes null/missing
sparse

A sparse index only includes documents where the indexed field exists and is not null. Documents that are missing the field are excluded from the index entirely. This reduces index size when the field is optional and present in only a subset of documents.

// Standard index — all documents indexed (missing field stored as null)
db.users.createIndex({ linkedInUrl: 1 })
// Every user document indexed, even those without linkedInUrl

// Sparse index — only users WITH linkedInUrl are indexed
db.users.createIndex({ linkedInUrl: 1 }, { sparse: true, name: "idx_linkedin_sparse" })
// Documents without linkedInUrl are NOT in the index — smaller index size

The Critical Sparse Index Gotcha

DANGER
Querying for null or missing values on a sparse-indexed field falls back to COLLSCAN. Sparse indexes exclude documents with null/missing fields, so they cannot satisfy queries that specifically look for null or missing. This is the most common sparse index pitfall.
// Sparse index on linkedInUrl
db.users.createIndex({ linkedInUrl: 1 }, { sparse: true })

// ✅ Uses sparse index — finds existing linkedInUrl values
db.users.find({ linkedInUrl: "https://linkedin.com/in/alice" })

// ❌ COLLSCAN — querying for null/missing falls back (index excludes these docs)
db.users.find({ linkedInUrl: null })
db.users.find({ linkedInUrl: { $exists: false } })

// ❌ COLLSCAN — sort using sparse index when collection has docs without the field
// MongoDB cannot use sparse index for sort if it would exclude docs from results
db.users.find({}).sort({ linkedInUrl: 1 })

Standard vs Sparse Index Behavior

ScenarioStandard IndexSparse Index
Document missing indexed fieldIndexed as nullExcluded from index
Query { field: null }IXSCANCOLLSCAN (excluded)
Query { field: { $exists: false } }IXSCANCOLLSCAN
Query { field: "value" }IXSCANIXSCAN
Sort on field (all docs)Index can serve sortCOLLSCAN (excluded docs)
Index sizeFull collection sizeOnly docs with field
TIP
Use sparse indexes for optional fields that are present in less than ~20% of documents. A good example: an ssoToken field that only exists for enterprise accounts. The index stays small and only serves queries for SSO users — the common case query for non-SSO users won't be affected because they don't have the field at all.
03
Partial Index
Index only documents matching a filter expression — precise subset indexing
partial

A partial index uses a partialFilterExpression to index only the subset of documents that satisfy the filter. Unlike sparse (which only filters on field existence), partial can filter using any supported operator — enabling targeted indexes for high-frequency query patterns.

// Only index active users — queries on inactive users fall back to COLLSCAN
db.users.createIndex(
  { email: 1 },
  {
    partialFilterExpression: { isActive: true },
    name: "idx_email_active_partial"
  }
)

// Only index completed orders greater than $100
db.orders.createIndex(
  { customerId: 1, createdAt: -1 },
  {
    partialFilterExpression: {
      status:  "completed",
      amount:  { $gt: 100 }
    },
    name: "idx_customer_created_highvalue"
  }
)

Superset Requirement — Queries Must Include the Filter

DANGER
A partial index is only used if the query filter is a superset of the partialFilterExpression. The query must explicitly include the same conditions (or stricter ones). If the query omits the partial index conditions, MongoDB falls back to COLLSCAN — the index cannot be safely used because it excludes documents the query might need.
// Index with partialFilterExpression: { isActive: true }

// ✅ IXSCAN — query includes isActive: true (superset of filter)
db.users.find({ email: "a@b.com", isActive: true })

// ❌ COLLSCAN — query doesn't specify isActive (might need inactive users too)
db.users.find({ email: "a@b.com" })

// ❌ COLLSCAN — isActive: false is not a superset of isActive: true
db.users.find({ email: "a@b.com", isActive: false })

// ✅ IXSCAN — stricter condition satisfies superset requirement
// (any { isActive: true, role: "admin" } doc is also { isActive: true })
db.users.find({ email: "a@b.com", isActive: true, role: "admin" })

Supported Operators in partialFilterExpression

OperatorSupportedExample
$eq (and direct equality)Yes{ status: "active" }
$gt, $gte, $lt, $lteYes{ amount: { $gt: 100 } }
$typeYes{ field: { $type: "string" } }
$existsYes{ field: { $exists: true } }
$andYesMultiple conditions
$or, $in, $neNoNot supported in partialFilterExpression
$where, $exprNoNot supported
TIP
Partial indexes are ideal for status-based access patterns. If 95% of queries target "active" records and only 5% ever look at "archived" records, create a partial index covering active records only. The index is tiny, fast, and always in cache — archived queries can tolerate the COLLSCAN since they're rare.
04
Hidden Index
Maintained but invisible to query planner — safe removal workflow
hidden

A hidden index is fully maintained on writes (B-Tree updated on every insert/update/delete) but the query planner ignores it completely — as if it doesn't exist. This allows you to safely test the impact of removing an index before actually dropping it.

// Create an index that is hidden from the start
db.collection.createIndex(
  { field: 1 },
  { hidden: true, name: "idx_field_hidden" }
)

// Hide an existing index
db.collection.hideIndex("idx_field_name")

// Unhide — makes index visible to query planner again (instantly)
db.collection.unhideIndex("idx_field_name")

Safe Index Removal Workflow

Never drop an index in production without running the hidden index workflow first. Dropping an index is instant but irreversible — recreating a large index can take hours and heavily impacts write performance.

// Step 1: Check current index usage with $indexStats
db.orders.aggregate([{ $indexStats: {} }])
// Look at: name, ops (access count since last restart), since (when tracking started)
// Candidates for removal: ops: 0 or very low for the tracking duration

// Step 2: Hide the index
db.orders.hideIndex("idx_status_created")
// Queries that relied on this index now fall back to other indexes or COLLSCAN
// The index is still maintained on writes — no data loss, instant recovery

// Step 3: Monitor for 24–48 hours (cover at least one business cycle)
// Watch: query response times, slow query logs, application error rates
// MongoDB Atlas has Performance Advisor for automated slow query detection

// Step 4a: If no degradation → index was not used → safe to drop
db.orders.dropIndex("idx_status_created")

// Step 4b: If queries slowed → index WAS needed → unhide immediately
db.orders.unhideIndex("idx_status_created")
// Recovery is instant — no rebuild required
TIP
The hidden index workflow is the safest approach to index management in production. Since unhiding is instantaneous, there is zero risk compared to dropping and rebuilding. Always hide before dropping, especially for indexes on large collections where recreation would be expensive.

Hidden Index Properties

PropertyBehavior
Write overheadFull — B-Tree updated on every write (same as visible index)
Query plannerCompletely ignored — no IXSCAN possible
Index size in RAMStill occupies WiredTiger cache
Hint overrideCan force hidden index via .hint({ field: 1 }) — bypasses hidden status
Visibility in getIndexes()Shown with hidden: true property
Re-enabling timeInstant — no rebuild
05
Collation Index
Locale-aware string comparison for language-specific queries
collation

A collation index defines locale-specific string comparison rules for the index — controlling case sensitivity, accent sensitivity, and character ordering. It enables queries that treat "cafe" and "café" as equal, or "A" and "a" as equal.

// Create a case-insensitive index using English collation
db.users.createIndex(
  { name: 1 },
  {
    collation: { locale: "en", strength: 2 },
    name: "idx_name_ci"
  }
)
// strength: 1 = case and accent insensitive
// strength: 2 = case insensitive, accent sensitive (most common)
// strength: 3 = case sensitive, accent sensitive (default binary)

// Query MUST specify the same collation to use the index
db.users.find(
  { name: "alice" }
).collation({ locale: "en", strength: 2 })
// Matches: "alice", "Alice", "ALICE" — all treated as equal

Collation Query Requirement

WARN
A query only uses a collation index if it specifies the same collation as the index. A query without .collation() will not use the collation index (even for simple equality on the indexed field) and falls back to the default binary comparison or COLLSCAN. Always match collations explicitly.
// Index collation: { locale: "en", strength: 2 }

// ✅ Uses index — collation matches
db.users.find({ name: "alice" }).collation({ locale: "en", strength: 2 })

// ❌ Does NOT use collation index — no collation specified on query
db.users.find({ name: "alice" })
// Falls back to default binary comparison — different results for "Alice" vs "alice"

// Setting default collection collation — all queries use it automatically
db.createCollection("users", {
  collation: { locale: "en", strength: 2 }
})
// Now all queries on this collection use this collation by default

Common Collation Strength Settings

StrengthCase SensitiveAccent SensitiveUse Case
1NoNo"cafe" = "café" = "Café" — broad text matching
2NoYes"alice" = "Alice"; "cafe" ≠ "café" — common CI search
3YesYesDefault binary-like behavior with locale ordering
06
Combining Properties
unique + sparse, partial + unique, and other combinations
combos

Index properties can be combined to create highly targeted constraints. The most useful combinations solve real-world problems that neither property alone can address.

unique + sparse: Optional Unique Fields

// Allow multiple users without a phone number,
// but enforce uniqueness among those who DO have one
db.users.createIndex(
  { phone: 1 },
  { unique: true, sparse: true, name: "idx_phone_unique_sparse" }
)

db.users.insertOne({ name: "Alice" })                         // ✅ no phone — excluded
db.users.insertOne({ name: "Bob" })                           // ✅ no phone — excluded
db.users.insertOne({ name: "Carol", phone: "+1234567890" })   // ✅ unique phone
db.users.insertOne({ name: "Dave",  phone: "+1234567890" })   // ❌ E11000

partial + unique: Conditional Uniqueness

// Enforce unique email only for active users
// Inactive (soft-deleted) users can share email values with active users
db.users.createIndex(
  { email: 1 },
  {
    unique: true,
    partialFilterExpression: { isActive: true },
    name: "idx_email_unique_active"
  }
)

db.users.insertOne({ email: "a@b.com", isActive: true  })   // ✅ indexed, unique
db.users.insertOne({ email: "a@b.com", isActive: false })   // ✅ not indexed (partial)
db.users.insertOne({ email: "a@b.com", isActive: true  })   // ❌ E11000 (duplicate active)

partial + sparse: Redundancy Note

// partialFilterExpression with { $exists: true } is functionally equivalent to sparse
db.col.createIndex({ field: 1 }, { sparse: true })

// Equivalent partial index:
db.col.createIndex({ field: 1 }, {
  partialFilterExpression: { field: { $exists: true } }
})

// However, partial is more flexible — can add more filter conditions than just existence
// Prefer partial over sparse for new code when possible
NOTE
When choosing between sparse and partial: partial is strictly more powerful than sparse. Sparse only filters on field existence. Partial can filter on any supported operator combination. Use sparse only if you are on an older MongoDB version (< 3.2) where partial isn't available, or for simplicity when existence-only filtering is sufficient.
07
Property Comparison
Choosing the right index property for your use case
reference
PropertyPurposeKey ConstraintNull/Missing Behavior
Standard (default)General filtering and sortingNoneIndexed as null
uniqueEnforce value uniquenessOnly one null per indexNull treated as value (only one allowed)
sparseOptional field — reduce index sizeCannot query for null/missing via indexExcluded from index entirely
partialIndex a query-relevant subsetQuery must be superset of partialFilterExpressionExcluded if doesn't match filter
hiddenSafe removal testingFull write overhead still appliesStandard behavior when visible
collationLocale-aware string comparisonQuery must specify matching collationStandard null handling

Decision Guide

RequirementUse
Enforce uniqueness, field always presentunique
Enforce uniqueness, field optionalunique + sparse or unique + partial
Field present on only a few documentssparse or partial ({ $exists: true })
Only certain documents are queried (e.g., active records)partial
Case-insensitive searchcollation (strength: 2)
Test removing an index without data loss riskhidden
Conditional uniqueness (unique only within a subset)unique + partial
WARN
Sparse and partial indexes both reduce index size and write overhead — but they also reduce query flexibility. Any query that doesn't satisfy the exclusion criteria (null for sparse, partialFilterExpression for partial) cannot use the index and falls back to COLLSCAN. Always verify all your query patterns before enabling sparse or partial properties.