Sharding

Sharding partitions large collections into smaller segments (shards) based on field values. Instead of scanning an entire collection, queries target only the relevant shards — reducing read latency and improving throughput for high-volume datasets. For example, an orderbook collection with 50 markets and hourly snapshots can be sharded so that a query for market=SUI, hour=12 scans a single shard (~600KB) instead of the full dataset (~30GB).

Shard Key Types

OnDB supports three sharding strategies that can be combined in a hierarchy:

Type	Description	Use Case
`discrete`	Exact value partitioning. Each unique value gets its own shard.	Markets, regions, tenant IDs
`time_range`	Time-based bucketing with configurable granularity (`minute`, `hour`, `day`, `week`, `month`).	Logs, events, time-series data
`hash_distributed`	Even distribution across a fixed number of buckets via hashing.	High-cardinality fields, write-heavy workloads

Setting Up Sharding

With syncCollection (Recommended)

syncCollection creates or updates a collection’s indexes and sharding configuration in a single call. This is the recommended approach for new collections.

import { createClient } from '@ondb/sdk';

const client = createClient({
  endpoint: 'https://api.ondb.io',
  appId: 'my-app',
  appKey: 'your-app-key',
});

await client.syncCollection({
  name: 'orderbook_snapshots',
  fields: {
    market: { type: 'string', index: true },
    timestamp: { type: 'number', index: true },
    mid_price: { type: 'number' },
    spread: { type: 'number' },
  },
  sharding: {
    keys: [
      { field: 'market', type: 'discrete' },
      { field: 'timestamp', type: 'time_range', granularity: 'hour' },
    ],
    enforce_in_queries: true,
    max_shard_size: 50_000_000, // 50MB per shard
  },
});

Method signature:

async syncCollection(
  schema: SimpleCollectionSchemaWithSharding
): Promise<SyncCollectionResult>

The result includes a sharding_configured field that confirms whether sharding was applied.

With setupSharding (Existing Collection)

Use setupSharding to add sharding to a collection that already exists and has data.

await client.setupSharding('orderbook_snapshots', {
  keys: [
    { field: 'market', type: 'discrete' },
    { field: 'timestamp', type: 'time_range', granularity: 'hour' },
  ],
  enforce_in_queries: true,
});

Method signature:

async setupSharding(
  collection: string,
  sharding: ShardingStrategy
): Promise<void>

Shard Key Configuration

ShardingStrategy

interface ShardingStrategy {
  /** Ordered list of shard keys -- each adds a level to the hierarchy */
  keys: ShardKey[];

  /** If true, queries must include all shard key fields */
  enforce_in_queries: boolean;

  /** Max bytes per shard (optional) */
  max_shard_size?: number;
}

ShardKey

interface ShardKey {
  /** Field name to shard by (must be an indexed field) */
  field: string;

  /** Sharding type */
  type: 'discrete' | 'time_range' | 'hash_distributed';

  /** Required for time_range: 'minute' | 'hour' | 'day' | 'week' | 'month' */
  granularity?: 'minute' | 'hour' | 'day' | 'week' | 'month';

  /** Required for hash_distributed: number of hash buckets */
  num_buckets?: number;
}

Examples

Time-Series Data

Partition event logs by day for efficient date-range queries:

await client.syncCollection({
  name: 'system_events',
  fields: {
    event_type: { type: 'string', index: true },
    timestamp: { type: 'date', index: true },
    severity: { type: 'string', index: true },
    message: { type: 'string' },
  },
  sharding: {
    keys: [
      { field: 'timestamp', type: 'time_range', granularity: 'hour' },
    ],
    enforce_in_queries: true,
  },
});

Multi-Tenant Data

Isolate tenant data into discrete shards:

await client.syncCollection({
  name: 'user_activity',
  fields: {
    tenant_id: { type: 'string', index: true },
    user_id: { type: 'string', index: true },
    action: { type: 'string', index: true },
    timestamp: { type: 'date', index: true },
  },
  sharding: {
    keys: [
      { field: 'tenant_id', type: 'discrete' },
    ],
    enforce_in_queries: true,
  },
});

High-Volume Writes

Distribute writes evenly across hash buckets to avoid hotspots:

await client.syncCollection({
  name: 'telemetry',
  fields: {
    device_id: { type: 'string', index: true },
    reading: { type: 'number' },
    timestamp: { type: 'date', index: true },
  },
  sharding: {
    keys: [
      { field: 'device_id', type: 'hash_distributed', num_buckets: 64 },
    ],
    enforce_in_queries: false,
  },
});

Query Considerations

When enforce_in_queries is set to true, every query against the collection must include all shard key fields. This prevents full-collection scans and ensures queries hit only the relevant shards.

// With enforce_in_queries: true and keys [market, timestamp]
// This query targets a single shard:
const results = await client.queryBuilder()
  .collection('orderbook_snapshots')
  .whereField('market').equals('SUI')
  .whereField('timestamp').greaterThanOrEqual(1705305600)
  .whereField('timestamp').lessThan(1705309200)
  .execute();

// Omitting a shard key field will return an error
// when enforce_in_queries is enabled.

If you need to run occasional cross-shard queries (e.g., analytics), set enforce_in_queries: false. Be aware that queries without shard key filters will scan all shards.

Getting Started

Core Concepts

Querying Data

CRUD Operations

Payments

Advanced

AI Tools

Shard Key Types

Setting Up Sharding

With syncCollection (Recommended)

With setupSharding (Existing Collection)

Shard Key Configuration

ShardingStrategy

ShardKey

Examples

Time-Series Data

Multi-Tenant Data

High-Volume Writes

Query Considerations

Next Steps

Collections & Indexes

Query Builder

Getting Started

Core Concepts

Querying Data

CRUD Operations

Payments

Advanced

AI Tools

Documentation Index

​Shard Key Types

​Setting Up Sharding

​With syncCollection (Recommended)

​With setupSharding (Existing Collection)

​Shard Key Configuration

​ShardingStrategy

​ShardKey

​Examples

​Time-Series Data

​Multi-Tenant Data

​High-Volume Writes

​Query Considerations

​Next Steps

Collections & Indexes

Query Builder

Shard Key Types

Setting Up Sharding

With syncCollection (Recommended)

With setupSharding (Existing Collection)

Shard Key Configuration

ShardingStrategy

ShardKey

Examples

Time-Series Data

Multi-Tenant Data

High-Volume Writes

Query Considerations

Next Steps