
Time Series

FILE  42_time_series
TOPIC  Time Series Collections · timeField · metaField · $setWindowFields · TTL · $densify
LEVEL  Intermediate/Advanced
01
Overview & Creation
Native time series collections — automatic bucketing and compression
overview

Time Series Collections (v5.0+) are a specialized collection type optimized for sequential time-stamped data. MongoDB automatically applies the bucket pattern internally, grouping documents by time and metaField. This provides significant storage savings and faster range queries compared to regular collections.

Feature           | Regular Collection              | Time Series Collection
Storage           | One document per record         | Internally bucketed (automatic)
Compression       | Standard WiredTiger compression | Delta encoding + heavy compression
Range query speed | Depends on index                | Optimized (automatic timeField index)
Use case          | General purpose                 | IoT, metrics, logs, financial ticks
// Create a time series collection for IoT sensor data
db.createCollection("sensorReadings", {
  timeseries: {
    timeField:   "timestamp",   // REQUIRED: the date/ISODate field
    metaField:   "sensorId",    // OPTIONAL: groups readings by sensor
    granularity: "seconds"      // "seconds" | "minutes" | "hours"
  },
  expireAfterSeconds: 86400 * 90  // auto-delete after 90 days (optional)
})

// Create for stock price ticks
db.createCollection("stockTicks", {
  timeseries: {
    timeField:   "ts",
    metaField:   "ticker",    // groups AAPL ticks together in same bucket
    granularity: "seconds"
  }
})

// Create for application metrics (no natural device grouping)
db.createCollection("appMetrics", {
  timeseries: {
    timeField: "collectedAt",
    granularity: "minutes"    // no metaField if no natural grouping dimension
  }
})

Granularity Guide

Granularity | Bucket Span         | Best For
seconds     | 1 hour per bucket   | High-frequency sensor data (<1min intervals)
minutes     | 24 hours per bucket | Per-minute metrics, application logs
hours       | 30 days per bucket  | Daily aggregates, hourly summaries
02
Inserting Time Series Data
timeField must be a Date — metadata and measurements pattern
insert
// Insert a single reading
db.sensorReadings.insertOne({
  timestamp: new Date(),          // MUST be a Date type
  sensorId:  "sensor-042",        // metaField value — identifies the source
  temperature: 22.4,
  humidity:    58.2,
  pressure:    1013.25
})

// Bulk insert — optimal: all in same time window + same metaField value
db.sensorReadings.insertMany([
  { timestamp: ISODate("2024-03-01T10:00:00Z"), sensorId: "S1", temp: 22.4, humidity: 60 },
  { timestamp: ISODate("2024-03-01T10:00:05Z"), sensorId: "S1", temp: 22.5, humidity: 61 },
  { timestamp: ISODate("2024-03-01T10:00:10Z"), sensorId: "S1", temp: 22.3, humidity: 60 },
  { timestamp: ISODate("2024-03-01T10:00:00Z"), sensorId: "S2", temp: 19.1, humidity: 45 }
])
// MongoDB groups S1 readings into one bucket, S2 into another

// Best practices for insertion efficiency:
// 1. Insert in timestamp order per metaField (reduces bucket rewrites)
// 2. Batch inserts: insertMany is far more efficient than insertOne in a loop
// 3. Keep all measurements for one sensor in the same batch call where possible
// 4. Don't backfill old timestamps into active buckets (creates new buckets for old time range)
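Practices 1 and 3 amount to ordering each batch by metaField first, then by timestamp, before handing it to insertMany. A minimal client-side sketch (the `orderBatch` helper is hypothetical, not a driver API):

```javascript
// Hypothetical helper: order a mixed batch by (sensorId, timestamp) before
// insertMany, so each sensor's readings arrive contiguously and in time order.
function orderBatch(readings) {
  return [...readings].sort((a, b) =>
    a.sensorId === b.sensorId
      ? a.timestamp - b.timestamp                 // Date subtraction yields ms
      : a.sensorId < b.sensorId ? -1 : 1
  );
}

const batch = orderBatch([
  { timestamp: new Date("2024-03-01T10:00:05Z"), sensorId: "S2", temp: 19.2 },
  { timestamp: new Date("2024-03-01T10:00:10Z"), sensorId: "S1", temp: 22.3 },
  { timestamp: new Date("2024-03-01T10:00:00Z"), sensorId: "S1", temp: 22.4 }
]);
// batch is now S1@10:00:00, S1@10:00:10, S2@10:00:05
// db.sensorReadings.insertMany(batch)   // then insert in a single call
```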

// metaField: recommended document structure
// metaField can be a nested document for richer grouping:
db.metrics.insertOne({
  collectedAt: new Date(),
  source: {              // metaField = "source"
    host:    "web-01",
    region:  "us-east-1",
    service: "api"
  },
  cpu:    72.3,
  memory: 4096,
  requests: 1247
})
03
Querying & Aggregation
Range queries, downsampling, and time-based grouping
queries
// Range query — automatically uses timeField index
db.sensorReadings.find({
  timestamp: {
    $gte: ISODate("2024-03-01T00:00:00Z"),
    $lt:  ISODate("2024-03-02T00:00:00Z")
  },
  sensorId: "sensor-042"   // metaField filter — bucket pruning
})

// Downsample: hourly averages from per-second data
db.sensorReadings.aggregate([
  {
    $match: {
      timestamp: {
        $gte: ISODate("2024-03-01T00:00:00Z"),
        $lt:  ISODate("2024-03-08T00:00:00Z")
      }
    }
  },
  {
    $group: {
      _id: {
        sensorId: "$sensorId",
        hour: {
          $dateTrunc: {
            date: "$timestamp",
            unit: "hour"
          }
        }
      },
      avgTemp:   { $avg: "$temperature" },
      maxTemp:   { $max: "$temperature" },
      minTemp:   { $min: "$temperature" },
      readings:  { $sum: 1 }
    }
  },
  { $sort: { "_id.hour": 1, "_id.sensorId": 1 } }
])

// $dateTrunc for flexible time bucketing:
// unit: "minute" | "hour" | "day" | "week" | "month" | "quarter" | "year"
// binSize: group by N units (e.g., 15-minute buckets)
{
  $dateTrunc: {
    date:    "$timestamp",
    unit:    "minute",
    binSize: 15    // 15-minute intervals
  }
}
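What that binSize computation does can be sketched client-side: snap a timestamp down to the start of its 15-minute bin. (A simplified sketch; the server anchors bins to a fixed UTC reference point, which coincides with this epoch-based math for minute-sized bins.)

```javascript
// Truncate a Date down to the start of its N-minute bin (UTC),
// mirroring $dateTrunc with unit "minute" and binSize N.
function truncToBin(date, binSizeMinutes) {
  const binMs = binSizeMinutes * 60 * 1000;
  return new Date(Math.floor(date.getTime() / binMs) * binMs);
}

truncToBin(new Date("2024-03-01T10:37:42Z"), 15);
// → 2024-03-01T10:30:00Z
```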

// Last value per sensor (most recent reading)
db.sensorReadings.aggregate([
  { $sort: { sensorId: 1, timestamp: -1 } },
  { $group: {
    _id:         "$sensorId",
    lastReading: { $first: "$$ROOT" }
  }}
])
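The sort + $first combination reduces to "keep the reading with the greatest timestamp per sensorId". A client-side sketch of the same logic:

```javascript
// Keep only the most recent reading per sensorId,
// equivalent to the $sort + $group/$first pipeline above.
function lastPerSensor(readings) {
  const latest = new Map();
  for (const r of readings) {
    const cur = latest.get(r.sensorId);
    if (!cur || r.timestamp > cur.timestamp) latest.set(r.sensorId, r);
  }
  return latest;                // Map of sensorId → latest reading
}
```

On MongoDB 5.2+, the $top accumulator can express the same query without a separate $sort stage.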
04
$setWindowFields
Window functions — running totals, moving averages, ranks
windows

$setWindowFields (v5.0+) adds SQL-style window functions to MongoDB aggregation. It computes values over a sliding or fixed window of documents without collapsing them into groups — all input documents are preserved in output.

// Moving average — 5-reading rolling average temperature per sensor
db.sensorReadings.aggregate([
  { $sort: { sensorId: 1, timestamp: 1 } },
  {
    $setWindowFields: {
      partitionBy: "$sensorId",     // calculate independently per sensor
      sortBy:      { timestamp: 1 },
      output: {
        movingAvgTemp: {
          $avg: "$temperature",
          window: {
            documents: [-4, 0]      // current doc + 4 before it (5 total)
          }
        }
      }
    }
  }
])
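The documents: [-4, 0] window can be sketched client-side: for each position, average the current value and up to four preceding values (near the start of a partition the window is simply smaller, which is also how $setWindowFields behaves):

```javascript
// Rolling average over the current value and up to `before` preceding values,
// mirroring a documents: [-before, 0] window within one partition.
function movingAvg(values, before) {
  return values.map((_, i) => {
    const slice = values.slice(Math.max(0, i - before), i + 1);
    return slice.reduce((s, v) => s + v, 0) / slice.length;
  });
}

movingAvg([10, 20, 30, 40, 50], 4);
// → [10, 15, 20, 25, 30]
```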

// Cumulative sum — total readings per sensor over time
db.sensorReadings.aggregate([
  {
    $setWindowFields: {
      partitionBy: "$sensorId",
      sortBy:      { timestamp: 1 },
      output: {
        cumulativeReadings: {
          $sum:   { $literal: 1 },
          window: { documents: ["unbounded", "current"] }  // all from start to current
        },
        runningTotalTemp: {
          $sum:   "$temperature",
          window: { documents: ["unbounded", "current"] }
        }
      }
    }
  }
])
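Within one partition, the ["unbounded", "current"] window is just a running prefix sum:

```javascript
// ["unbounded", "current"] in a single partition reduces to a prefix sum:
// each position gets the total of everything from the start up to itself.
function runningTotals(values) {
  let sum = 0;
  return values.map(v => (sum += v));
}

runningTotals([1, 2, 3]); // → [1, 3, 6]
```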

// Range-based window (time range instead of document count)
db.sensorReadings.aggregate([
  {
    $setWindowFields: {
      partitionBy: "$sensorId",
      sortBy:      { timestamp: 1 },
      output: {
        hourlyAvgTemp: {
          $avg: "$temperature",
          window: {
            range: [-3600000, 0],   // last 3600000ms (1 hour) before current doc
            unit:  "millisecond"
          }
        }
      }
    }
  }
])

// Rank and dense rank
db.leaderboard.aggregate([
  {
    $setWindowFields: {
      sortBy: { score: -1 },
      output: {
        rank:      { $rank: {} },
        denseRank: { $denseRank: {} },
        docNumber: { $documentNumber: {} }
      }
    }
  }
])
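The difference between the three operators shows up on ties: tied scores share a rank; $rank then skips the following values while $denseRank does not, and $documentNumber never ties. A client-side sketch:

```javascript
// Compute rank, dense rank, and document number over descending scores,
// mirroring $rank / $denseRank / $documentNumber.
function rankAll(scores) {
  const sorted = [...scores].sort((a, b) => b - a);
  let rank = 0, dense = 0, prev;
  return sorted.map((s, i) => {
    if (s !== prev) { rank = i + 1; dense += 1; prev = s; }
    return { score: s, rank, denseRank: dense, docNumber: i + 1 };
  });
}

rankAll([50, 40, 40, 30]);
// ranks: 1, 2, 2, 4; denseRanks: 1, 2, 2, 3; docNumbers: 1..4
```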
05
TTL & Expiry
Automatic data deletion based on age
TTL
// Configure TTL at collection creation (recommended for time series)
db.createCollection("sensorReadings", {
  timeseries: {
    timeField:   "timestamp",
    metaField:   "sensorId",
    granularity: "seconds"
  },
  expireAfterSeconds: 86400 * 30   // data expires after 30 days
})

// Update TTL on existing time series collection
db.runCommand({
  collMod: "sensorReadings",
  expireAfterSeconds: 86400 * 60   // change to 60 days
})

// Disable TTL (make data permanent)
db.runCommand({
  collMod: "sensorReadings",
  expireAfterSeconds: "off"   // the string "off" disables expiry;
})                            // 0 would expire documents immediately, not never

// TTL for regular collections (non-time-series) — index on date field
db.logs.createIndex({ createdAt: 1 }, { expireAfterSeconds: 604800 })  // 7 days
// TTL monitor runs every 60 seconds — deletion is not instantaneous
// Documents expire when: current_time > createdAt + expireAfterSeconds
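The expiry rule in the comment above, as a sketch:

```javascript
// A document becomes eligible for TTL deletion once
// now > createdAt + expireAfterSeconds. Actual deletion may lag,
// since the TTL monitor only wakes roughly every 60 seconds.
function isExpired(createdAt, expireAfterSeconds, now = new Date()) {
  return now.getTime() > createdAt.getTime() + expireAfterSeconds * 1000;
}
```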
NOTE
TTL deletion works at the bucket level for time series collections: an entire bucket is deleted only when every document in it is past the expiry threshold. Partial bucket deletion does not occur, so data may live slightly longer than expireAfterSeconds (up to roughly one extra bucket span for the configured granularity). This is expected behavior.
06
$densify & $fill
Fill gaps in time series data
gaps

$densify fills in missing time points so that a result set has continuous time coverage. $fill populates null/missing values in fields using forward-fill, backward-fill, or linear interpolation.

// $densify — generate a document for every 1-hour slot in a range
// Even if there was no reading in that hour, a document is created
db.sensorReadings.aggregate([
  {
    $densify: {
      field: "timestamp",
      partitionByFields: ["sensorId"],    // per sensor
      range: {
        step:  1,
        unit:  "hour",
        bounds: [
          ISODate("2024-03-01T00:00:00Z"),
          ISODate("2024-03-02T00:00:00Z")
        ]
      }
    }
  }
])
// Output: the existing readings plus one generated document per missing
// hourly slot; generated docs carry only timestamp and sensorId (measurement
// fields are absent, not stored as null)
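What $densify generates can be sketched client-side: step through the bounds and emit a timestamp for every on-step slot that has no existing reading (assuming, as with $densify array bounds, a lower bound that is inclusive and an upper bound that is exclusive):

```javascript
// Find the hourly slots in [start, end) that have no existing reading,
// mirroring what $densify generates for one partition.
function missingHours(existingTimestamps, start, end) {
  const have = new Set(existingTimestamps.map(d => d.getTime()));
  const gaps = [];
  for (let t = start.getTime(); t < end.getTime(); t += 3600 * 1000) {
    if (!have.has(t)) gaps.push(new Date(t));
  }
  return gaps;
}
```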

// $fill — forward-fill missing temperature values (carry last known value)
db.sensorReadings.aggregate([
  {
    $densify: {
      field:  "timestamp",
      range: { step: 1, unit: "hour", bounds: "full" }
    }
  },
  {
    $fill: {
      sortBy:    { timestamp: 1 },
      partitionByFields: ["sensorId"],
      output: {
        temperature: { method: "locf"   },  // Last Observation Carried Forward
        humidity:    { method: "locf"   },
        pressure:    { method: "linear" }   // linear interpolation between known values
      }
    }
  }
])
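The two fill methods can be sketched client-side over a value sequence, treating null as a gap. (A simplified sketch: here linear interpolation is by position, whereas the server interpolates along the sortBy field values; like $fill, edge gaps with no neighbor on one side stay null.)

```javascript
// locf: carry the last non-null value forward.
function fillLocf(values) {
  let last = null;
  return values.map(v => (v === null ? last : (last = v)));
}

// linear: interpolate each run of nulls between its nearest known neighbors.
function fillLinear(values) {
  const out = [...values];
  let i = 0;
  while (i < out.length) {
    if (out[i] !== null) { i++; continue; }
    const lo = i - 1;                          // last known index (or -1)
    let hi = i;
    while (hi < out.length && out[hi] === null) hi++;
    if (lo >= 0 && hi < out.length) {          // known values on both sides
      for (let j = i; j < hi; j++) {
        out[j] = out[lo] + ((out[hi] - out[lo]) * (j - lo)) / (hi - lo);
      }
    }                                          // else: edge gap stays null
    i = hi;
  }
  return out;
}

fillLocf([1, null, null, 4]);   // → [1, 1, 1, 4]
fillLinear([1, null, null, 4]); // → [1, 2, 3, 4]
```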

// $fill with a constant value for missing points
db.metrics.aggregate([
  { $fill: {
    sortBy: { ts: 1 },
    output: {
      requests: { value: 0 }   // fill missing with zero (no traffic = 0 requests)
    }
  }}
])
07
Limitations & Tips
What time series collections cannot do
limits
Limitation                                                           | Workaround
Updates and deletes heavily restricted (5.0–6.0)                     | Insert corrected readings instead; remove old data via TTL
Update/delete support still limited (6.3+)                           | Filter only by metaField or a time range
timeField and metaField cannot change after creation                 | Create a new collection and migrate the data
Secondary indexes restricted (pre-6.0: timeField and metaField only) | On 6.0+, index measurement fields as well
No multi-document transactions                                       | Design so time series writes need no cross-collection atomicity
Results not guaranteed in insertion order                            | Always sort by timeField for ordered results
Restrictions on $lookup with a time series source                    | Aggregate first, then join; use a regular collection as the lookup target
// Add secondary index on metaField for faster per-device queries
db.sensorReadings.createIndex({ sensorId: 1 })     // metaField index
db.sensorReadings.createIndex({ "source.region": 1 })  // nested metaField

// Check time series collection stats
db.runCommand({ collStats: "sensorReadings" })
// numOrphanDocs: 0 (healthy)  — orphaned docs indicate write failures
// storageSize vs dataSize: time series shows high compression ratio

// Verify collection is recognized as time series:
db.getCollectionInfos({ name: "sensorReadings" })
// options.timeseries: { timeField, metaField, granularity } confirms it
TIP
For IoT/metrics projects starting fresh on MongoDB 5.0+, always prefer native time series collections over the manual bucket pattern. The automatic bucketing, built-in compression, and optimized range queries far outperform hand-rolled bucket implementations. Use the manual bucket pattern only when you need update/delete support or are on MongoDB < 5.0.