
Sharding and partitioning guidelines #301

@kneth

Description


Sharding and partitioning are two important topics to understand when working with CrateDB. They are documented, but the information is scattered, which makes it harder for our customers to create optimal strategies.

Currently, we have information about it in:

ChatGPT summarizes it as follows:

Optimal / Recommended Settings for CrateDB

  1. Shard Size
    Target 3–70 GB per shard (sweet spot: 20–40 GB).
    Avoid:

    • Tiny shards (<1 GB): too much metadata overhead.
    • Huge shards (>100 GB): slow recovery and queries.
  2. Number of Shards

  • Per partition: Choose #shards so each shard lands in the 3–70 GB range.
  • Example: If a monthly partition is ~210 GB → 6 shards ≈ 35 GB each.
  • Per node: Keep <1000 shards total (including replicas).
  • Cluster-wide: At least as many shards as CPU cores (to allow parallelism), but don’t multiply shards just to “match” cores.
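The worked example above (a ~210 GB monthly partition split into 6 shards of ~35 GB each) can be sketched as a table definition. This is a minimal illustration, not a definitive schema; the table and column names (`readings`, `ts`, `value`) are assumptions:

```sql
-- Hypothetical table; with ~210 GB of data per month, 6 shards put
-- each shard near 35 GB, inside the recommended 3–70 GB range.
CREATE TABLE readings (
    ts TIMESTAMP WITH TIME ZONE,
    value DOUBLE PRECISION
) CLUSTERED INTO 6 SHARDS;
```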
  3. Partitioning Strategy
  • Use time-based partitions for time-series data:
  • Month is often a good balance (date_trunc('month', ts)).
  • Week or day only if data volume is very high.
  • Keep partitions coarse enough that you don’t exceed 1000 partitions per table.
  • Drop old partitions instead of deleting rows (much faster & cheaper).
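The points above can be combined into one sketch: a generated `date_trunc` column used as the partition key, and a partition drop expressed as a `DELETE` on the full partition value (which CrateDB resolves as a whole-partition drop rather than row deletes). Table and column names are illustrative:

```sql
-- Hypothetical time-series schema with monthly partitions.
CREATE TABLE metrics (
    ts TIMESTAMP WITH TIME ZONE,
    month TIMESTAMP WITH TIME ZONE GENERATED ALWAYS AS date_trunc('month', ts),
    payload OBJECT
) PARTITIONED BY (month);

-- Dropping an old partition: a DELETE matching the exact partition value
-- removes the partition wholesale, much faster than deleting rows.
DELETE FROM metrics WHERE month = '2023-01-01T00:00:00Z';
```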
  4. Replication
  • Replicas = 1 in production for fault tolerance.
  • Replicas = 0 for dev/test or non-critical data.
  • More replicas improve read scaling but increase disk and write cost.
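Replica count can be changed at runtime; a sketch of switching between the production and dev/test settings mentioned above (table name is an assumption):

```sql
-- Production: one replica per shard for fault tolerance.
ALTER TABLE metrics SET (number_of_replicas = 1);

-- Dev/test or non-critical data: no replicas, halving disk usage.
ALTER TABLE metrics SET (number_of_replicas = 0);
```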
  5. Routing / Sharding Key
  • Default: CrateDB uses the primary key hash for shard routing.
  • You can specify CLUSTERED BY if you want custom routing (e.g. by customer ID).
  • For time-series, keep routing simple — partitioning already distributes well.
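A sketch of the custom-routing case (routing by customer ID): `CLUSTERED BY` makes all rows with the same key land on the same shard. Table, column names, and shard count are illustrative:

```sql
-- All rows for one customer_id are routed to the same shard,
-- which keeps per-customer queries on a single shard.
CREATE TABLE orders (
    customer_id TEXT,
    order_id TEXT,
    total DOUBLE PRECISION
) CLUSTERED BY (customer_id) INTO 8 SHARDS;
```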
  6. Schema & Indexing
  • Use generated columns for partitioning keys (date_trunc).
  • Avoid over-indexing; each indexed column adds cost to writes and storage.
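The over-indexing point can be illustrated by turning indexing off on a column that is stored but never filtered on, trading queryability for cheaper writes and storage. Names are assumptions:

```sql
-- raw_payload is retained but not indexed; it cannot be used in
-- WHERE clauses, but writes and storage become cheaper.
CREATE TABLE events (
    ts TIMESTAMP WITH TIME ZONE,
    raw_payload TEXT INDEX OFF
);
```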
  7. Operational Guidance

  • Estimate data volume growth and adjust #shards for new partitions as needed (via ALTER TABLE … SET (number_of_shards = …)).

  • Monitor shard sizes over time to detect growth hotspots.
  • Balance shards across nodes to avoid hotspots; CrateDB does this automatically, but adding/removing nodes may require REBALANCE.
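A sketch of the shard-count adjustment mentioned above, assuming a partitioned table named `metrics`; on a partitioned table this changes the template for partitions created afterwards, while existing partitions keep their shard count:

```sql
-- Only partitions created after this statement get 12 shards;
-- already-existing partitions are unaffected.
ALTER TABLE metrics SET (number_of_shards = 12);
```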
  8. Monitoring & Maintenance
  • Track shard counts per node (system tables or Admin UI).
  • Keep an eye on:
    • Query latency for very large partitions.
    • Recovery/rebalance times if nodes restart.
  • Run regular benchmarks with realistic data + queries before finalizing partition/shard settings.
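Shard counts and sizes per node can be read from the `sys.shards` system table; a sketch of such a monitoring query (column usage assumed from the `sys.shards` schema, where `size` is in bytes):

```sql
-- Shard count and total shard size per node; useful for spotting
-- nodes approaching the <1000-shards guideline or growth hotspots.
SELECT node['name'] AS node_name,
       count(*) AS shard_count,
       sum(size) / power(1024.0, 3) AS total_gb
FROM sys.shards
GROUP BY node['name']
ORDER BY shard_count DESC;
```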
  9. Table creation guidelines
  • Depending on the character of the table, different sharding and partitioning recommendations apply. For example, a small, relatively static table with little traffic may not need additional replicas, and sharding or partitioning it is likely not worthwhile.

    This statement has been added by @amotl, picked up from recent discussions, but needs to be verified before converging it into the docs.

In short

  • Shard size: 3–70 GB.
  • Partitions: monthly (adjust if needed).
  • Shards per node: <1000 (incl. replicas).
  • Replication: 1 for prod.
  • Routing: default unless strong reason otherwise.
  • Plan ahead — you can only adjust shard count for future partitions.
