
Time Series

FILE  42_time_series
TOPIC  Time Series Collections · timeField · metaField · $setWindowFields · TTL · $densify
LEVEL  Intermediate/Advanced
01
Overview & Creation
Native time series collections — automatic bucketing and compression
overview

Time Series Collections (v5.0+) are a specialized collection type optimized for sequential time-stamped data. MongoDB automatically applies the bucket pattern internally, grouping documents by time and metaField. This provides significant storage savings and faster range queries compared to regular collections.

Feature           | Regular Collection              | Time Series Collection
Storage           | One document per record         | Internally bucketed (automatic)
Compression       | Standard WiredTiger compression | Delta encoding + heavy compression
Range query speed | Depends on index                | Optimized (automatic timeField index)
Use case          | General purpose                 | IoT, metrics, logs, financial ticks
// Create a time series collection for IoT sensor data
db.createCollection("sensorReadings", {
  timeseries: {
    timeField:   "timestamp",   // REQUIRED: the date/ISODate field
    metaField:   "sensorId",    // OPTIONAL: groups readings by sensor
    granularity: "seconds"      // "seconds" | "minutes" | "hours"
  },
  expireAfterSeconds: 86400 * 90  // auto-delete after 90 days (optional)
})

// Create for stock price ticks
db.createCollection("stockTicks", {
  timeseries: {
    timeField:   "ts",
    metaField:   "ticker",    // groups AAPL ticks together in same bucket
    granularity: "seconds"
  }
})

// Create for application metrics (no natural device grouping)
db.createCollection("appMetrics", {
  timeseries: {
    timeField: "collectedAt",
    granularity: "minutes"    // no metaField if no natural grouping dimension
  }
})

Granularity Guide

Granularity | Bucket Span         | Best For
seconds     | 1 hour per bucket   | High-frequency sensor data (<1min intervals)
minutes     | 24 hours per bucket | Per-minute metrics, application logs
hours       | 30 days per bucket  | Daily aggregates, hourly summaries
02
Inserting Time Series Data
timeField must be a Date — metadata and measurements pattern
insert
// Insert a single reading
db.sensorReadings.insertOne({
  timestamp: new Date(),          // MUST be a Date type
  sensorId:  "sensor-042",        // metaField value — identifies the source
  temperature: 22.4,
  humidity:    58.2,
  pressure:    1013.25
})

// Bulk insert — optimal: all in same time window + same metaField value
db.sensorReadings.insertMany([
  { timestamp: ISODate("2024-03-01T10:00:00Z"), sensorId: "S1", temp: 22.4, humidity: 60 },
  { timestamp: ISODate("2024-03-01T10:00:05Z"), sensorId: "S1", temp: 22.5, humidity: 61 },
  { timestamp: ISODate("2024-03-01T10:00:10Z"), sensorId: "S1", temp: 22.3, humidity: 60 },
  { timestamp: ISODate("2024-03-01T10:00:00Z"), sensorId: "S2", temp: 19.1, humidity: 45 }
])
// MongoDB groups S1 readings into one bucket, S2 into another

// Best practices for insertion efficiency:
// 1. Insert in timestamp order per metaField (reduces bucket rewrites)
// 2. Batch inserts: insertMany is far more efficient than insertOne in a loop
// 3. Keep all measurements for one sensor in the same batch call where possible
// 4. Don't backfill old timestamps into active buckets (creates new buckets for old time range)
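Practices 1 and 3 amount to ordering each batch by metaField first, then by timestamp, before handing it to insertMany. A minimal client-side sketch (the `orderBatch` helper is hypothetical, not a driver API):

```javascript
// Hypothetical helper: order a mixed batch by (sensorId, timestamp) before
// insertMany, so each sensor's readings arrive contiguously and in time order.
function orderBatch(readings) {
  return [...readings].sort((a, b) =>
    a.sensorId === b.sensorId
      ? a.timestamp - b.timestamp                 // Date subtraction yields ms
      : a.sensorId < b.sensorId ? -1 : 1
  );
}

const batch = orderBatch([
  { timestamp: new Date("2024-03-01T10:00:05Z"), sensorId: "S2", temp: 19.2 },
  { timestamp: new Date("2024-03-01T10:00:10Z"), sensorId: "S1", temp: 22.3 },
  { timestamp: new Date("2024-03-01T10:00:00Z"), sensorId: "S1", temp: 22.4 }
]);
// batch is now S1@10:00:00, S1@10:00:10, S2@10:00:05
// db.sensorReadings.insertMany(batch)   // then insert in a single call
```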

// metaField: recommended document structure
// metaField can be a nested document for richer grouping:
db.metrics.insertOne({
  collectedAt: new Date(),
  source: {              // metaField = "source"
    host:    "web-01",
    region:  "us-east-1",
    service: "api"
  },
  cpu:    72.3,
  memory: 4096,
  requests: 1247
})
03
Querying & Aggregation
Range queries, downsampling, and time-based grouping
queries
// Range query — automatically uses timeField index
db.sensorReadings.find({
  timestamp: {
    $gte: ISODate("2024-03-01T00:00:00Z"),
    $lt:  ISODate("2024-03-02T00:00:00Z")
  },
  sensorId: "sensor-042"   // metaField filter — bucket pruning
})

// Downsample: hourly averages from per-second data
db.sensorReadings.aggregate([
  {
    $match: {
      timestamp: {
        $gte: ISODate("2024-03-01T00:00:00Z"),
        $lt:  ISODate("2024-03-08T00:00:00Z")
      }
    }
  },
  {
    $group: {
      _id: {
        sensorId: "$sensorId",
        hour: {
          $dateTrunc: {
            date: "$timestamp",
            unit: "hour"
          }
        }
      },
      avgTemp:   { $avg: "$temperature" },
      maxTemp:   { $max: "$temperature" },
      minTemp:   { $min: "$temperature" },
      readings:  { $sum: 1 }
    }
  },
  { $sort: { "_id.hour": 1, "_id.sensorId": 1 } }
])

// $dateTrunc for flexible time bucketing:
// unit: "minute" | "hour" | "day" | "week" | "month" | "quarter" | "year"
// binSize: group by N units (e.g., 15-minute buckets)
{
  $dateTrunc: {
    date:    "$timestamp",
    unit:    "minute",
    binSize: 15    // 15-minute intervals
  }
}
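What that binSize computation does can be sketched client-side: snap a timestamp down to the start of its 15-minute bin. (A simplified sketch; the server anchors bins to a fixed UTC reference point, which coincides with this epoch-based math for minute-sized bins.)

```javascript
// Truncate a Date down to the start of its N-minute bin (UTC),
// mirroring $dateTrunc with unit "minute" and binSize N.
function truncToBin(date, binSizeMinutes) {
  const binMs = binSizeMinutes * 60 * 1000;
  return new Date(Math.floor(date.getTime() / binMs) * binMs);
}

truncToBin(new Date("2024-03-01T10:37:42Z"), 15);
// → 2024-03-01T10:30:00Z
```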

// Last value per sensor (most recent reading)
db.sensorReadings.aggregate([
  { $sort: { sensorId: 1, timestamp: -1 } },
  { $group: {
    _id:         "$sensorId",
    lastReading: { $first: "$$ROOT" }
  }}
])
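The sort + $first combination reduces to "keep the reading with the greatest timestamp per sensorId". A client-side sketch of the same logic:

```javascript
// Keep only the most recent reading per sensorId,
// equivalent to the $sort + $group/$first pipeline above.
function lastPerSensor(readings) {
  const latest = new Map();
  for (const r of readings) {
    const cur = latest.get(r.sensorId);
    if (!cur || r.timestamp > cur.timestamp) latest.set(r.sensorId, r);
  }
  return latest;                // Map of sensorId → latest reading
}
```

On MongoDB 5.2+, the $top accumulator can express the same query without a separate $sort stage.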
04
$setWindowFields
Window functions — running totals, moving averages, ranks
windows

$setWindowFields (v5.0+) adds SQL-style window functions to MongoDB aggregation. It computes values over a sliding or fixed window of documents without collapsing them into groups — all input documents are preserved in output.

// Moving average — 5-reading rolling average temperature per sensor
db.sensorReadings.aggregate([
  { $sort: { sensorId: 1, timestamp: 1 } },
  {
    $setWindowFields: {
      partitionBy: "$sensorId",     // calculate independently per sensor
      sortBy:      { timestamp: 1 },
      output: {
        movingAvgTemp: {
          $avg: "$temperature",
          window: {
            documents: [-4, 0]      // current doc + 4 before it (5 total)
          }
        }
      }
    }
  }
])
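The documents: [-4, 0] window can be sketched client-side: for each position, average the current value and up to four preceding values (near the start of a partition the window is simply smaller, which is also how $setWindowFields behaves):

```javascript
// Rolling average over the current value and up to `before` preceding values,
// mirroring a documents: [-before, 0] window within one partition.
function movingAvg(values, before) {
  return values.map((_, i) => {
    const slice = values.slice(Math.max(0, i - before), i + 1);
    return slice.reduce((s, v) => s + v, 0) / slice.length;
  });
}

movingAvg([10, 20, 30, 40, 50], 4);
// → [10, 15, 20, 25, 30]
```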

// Cumulative sum — total readings per sensor over time
db.sensorReadings.aggregate([
  {
    $setWindowFields: {
      partitionBy: "$sensorId",
      sortBy:      { timestamp: 1 },
      output: {
        cumulativeReadings: {
          $sum:   { $literal: 1 },
          window: { documents: ["unbounded", "current"] }  // all from start to current
        },
        runningTotalTemp: {
          $sum:   "$temperature",
          window: { documents: ["unbounded", "current"] }
        }
      }
    }
  }
])
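Within one partition, the ["unbounded", "current"] window is just a running prefix sum:

```javascript
// ["unbounded", "current"] in a single partition reduces to a prefix sum:
// each position gets the total of everything from the start up to itself.
function runningTotals(values) {
  let sum = 0;
  return values.map(v => (sum += v));
}

runningTotals([1, 2, 3]); // → [1, 3, 6]
```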

// Range-based window (time range instead of document count)
db.sensorReadings.aggregate([
  {
    $setWindowFields: {
      partitionBy: "$sensorId",
      sortBy:      { timestamp: 1 },
      output: {
        hourlyAvgTemp: {
          $avg: "$temperature",
          window: {
            range: [-3600000, 0],   // last 3600000ms (1 hour) before current doc
            unit:  "millisecond"
          }
        }
      }
    }
  }
])

// Rank and dense rank
db.leaderboard.aggregate([
  {
    $setWindowFields: {
      sortBy: { score: -1 },
      output: {
        rank:      { $rank: {} },
        denseRank: { $denseRank: {} },
        docNumber: { $documentNumber: {} }
      }
    }
  }
])
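The difference between the three operators shows up on ties: tied scores share a rank; $rank then skips the following values while $denseRank does not, and $documentNumber never ties. A client-side sketch:

```javascript
// Compute rank, dense rank, and document number over descending scores,
// mirroring $rank / $denseRank / $documentNumber.
function rankAll(scores) {
  const sorted = [...scores].sort((a, b) => b - a);
  let rank = 0, dense = 0, prev;
  return sorted.map((s, i) => {
    if (s !== prev) { rank = i + 1; dense += 1; prev = s; }
    return { score: s, rank, denseRank: dense, docNumber: i + 1 };
  });
}

rankAll([50, 40, 40, 30]);
// ranks: 1, 2, 2, 4; denseRanks: 1, 2, 2, 3; docNumbers: 1..4
```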
05
TTL & Expiry
Automatic data deletion based on age
TTL
// Configure TTL at collection creation (recommended for time series)
db.createCollection("sensorReadings", {
  timeseries: {
    timeField:   "timestamp",
    metaField:   "sensorId",
    granularity: "seconds"
  },
  expireAfterSeconds: 86400 * 30   // data expires after 30 days
})

// Update TTL on existing time series collection
db.runCommand({
  collMod: "sensorReadings",
  expireAfterSeconds: 86400 * 60   // change to 60 days
})

// Disable TTL (make data permanent)
db.runCommand({
  collMod: "sensorReadings",
  expireAfterSeconds: "off"   // the string "off" disables expiry;
})                            // 0 would expire documents immediately, not never

// TTL for regular collections (non-time-series) — index on date field
db.logs.createIndex({ createdAt: 1 }, { expireAfterSeconds: 604800 })  // 7 days
// TTL monitor runs every 60 seconds — deletion is not instantaneous
// Documents expire when: current_time > createdAt + expireAfterSeconds
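The expiry rule in the comment above, as a sketch:

```javascript
// A document becomes eligible for TTL deletion once
// now > createdAt + expireAfterSeconds. Actual deletion may lag,
// since the TTL monitor only wakes roughly every 60 seconds.
function isExpired(createdAt, expireAfterSeconds, now = new Date()) {
  return now.getTime() > createdAt.getTime() + expireAfterSeconds * 1000;
}
```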
NOTE
TTL deletion works at the bucket level for time series collections: an entire bucket is deleted only when every document in it is past the expiry threshold. Partial bucket deletion does not occur, so data may live slightly longer than expireAfterSeconds (up to roughly one extra bucket span for the configured granularity). This is expected behavior.
06
$densify & $fill
Fill gaps in time series data
gaps

$densify fills in missing time points so that a result set has continuous time coverage. $fill populates null/missing values in fields using forward-fill, backward-fill, or linear interpolation.

// $densify — generate a document for every 1-hour slot in a range
// Even if there was no reading in that hour, a document is created
db.sensorReadings.aggregate([
  {
    $densify: {
      field: "timestamp",
      partitionByFields: ["sensorId"],    // per sensor
      range: {
        step:  1,
        unit:  "hour",
        bounds: [
          ISODate("2024-03-01T00:00:00Z"),
          ISODate("2024-03-02T00:00:00Z")
        ]
      }
    }
  }
])
// Output: the existing readings plus one generated document per missing
// hourly slot; generated docs carry only timestamp and sensorId (measurement
// fields are absent, not stored as null)
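What $densify generates can be sketched client-side: step through the bounds and emit a timestamp for every on-step slot that has no existing reading (assuming, as with $densify array bounds, a lower bound that is inclusive and an upper bound that is exclusive):

```javascript
// Find the hourly slots in [start, end) that have no existing reading,
// mirroring what $densify generates for one partition.
function missingHours(existingTimestamps, start, end) {
  const have = new Set(existingTimestamps.map(d => d.getTime()));
  const gaps = [];
  for (let t = start.getTime(); t < end.getTime(); t += 3600 * 1000) {
    if (!have.has(t)) gaps.push(new Date(t));
  }
  return gaps;
}
```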

// $fill — forward-fill missing temperature values (carry last known value)
db.sensorReadings.aggregate([
  {
    $densify: {
      field:  "timestamp",
      range: { step: 1, unit: "hour", bounds: "full" }
    }
  },
  {
    $fill: {
      sortBy:    { timestamp: 1 },
      partitionByFields: ["sensorId"],
      output: {
        temperature: { method: "locf"   },  // Last Observation Carried Forward
        humidity:    { method: "locf"   },
        pressure:    { method: "linear" }   // linear interpolation between known values
      }
    }
  }
])
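The two fill methods can be sketched client-side over a value sequence, treating null as a gap. (A simplified sketch: here linear interpolation is by position, whereas the server interpolates along the sortBy field values; like $fill, edge gaps with no neighbor on one side stay null.)

```javascript
// locf: carry the last non-null value forward.
function fillLocf(values) {
  let last = null;
  return values.map(v => (v === null ? last : (last = v)));
}

// linear: interpolate each run of nulls between its nearest known neighbors.
function fillLinear(values) {
  const out = [...values];
  let i = 0;
  while (i < out.length) {
    if (out[i] !== null) { i++; continue; }
    const lo = i - 1;                          // last known index (or -1)
    let hi = i;
    while (hi < out.length && out[hi] === null) hi++;
    if (lo >= 0 && hi < out.length) {          // known values on both sides
      for (let j = i; j < hi; j++) {
        out[j] = out[lo] + ((out[hi] - out[lo]) * (j - lo)) / (hi - lo);
      }
    }                                          // else: edge gap stays null
    i = hi;
  }
  return out;
}

fillLocf([1, null, null, 4]);   // → [1, 1, 1, 4]
fillLinear([1, null, null, 4]); // → [1, 2, 3, 4]
```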

// $fill with a constant value for missing points
db.metrics.aggregate([
  { $fill: {
    sortBy: { ts: 1 },
    output: {
      requests: { value: 0 }   // fill missing with zero (no traffic = 0 requests)
    }
  }}
])
07
Limitations & Tips
What time series collections cannot do
limits
Limitation                                                           | Workaround
Updates and deletes heavily restricted (5.0–6.0)                     | Insert corrected readings instead; remove old data via TTL
Update/delete support still limited (6.3+)                           | Filter only by metaField or a time range
timeField and metaField cannot change after creation                 | Create a new collection and migrate the data
Secondary indexes restricted (pre-6.0: timeField and metaField only) | On 6.0+, index measurement fields as well
No multi-document transactions                                       | Design so time series writes need no cross-collection atomicity
Results not guaranteed in insertion order                            | Always sort by timeField for ordered results
Restrictions on $lookup with a time series source                    | Aggregate first, then join; use a regular collection as the lookup target
// Add secondary index on metaField for faster per-device queries
db.sensorReadings.createIndex({ sensorId: 1 })     // metaField index
db.sensorReadings.createIndex({ "source.region": 1 })  // nested metaField

// Check time series collection stats
db.runCommand({ collStats: "sensorReadings" })
// numOrphanDocs: 0 (healthy)  — orphaned docs indicate write failures
// storageSize vs dataSize: time series shows high compression ratio

// Verify collection is recognized as time series:
db.getCollectionInfos({ name: "sensorReadings" })
// options.timeseries: { timeField, metaField, granularity } confirms it
TIP
For IoT/metrics projects starting fresh on MongoDB 5.0+, always prefer native time series collections over the manual bucket pattern. The automatic bucketing, built-in compression, and optimized range queries far outperform hand-rolled bucket implementations. Use the manual bucket pattern only when you need update/delete support or are on MongoDB < 5.0.