Aida Stochastic SDb

Jack edited this page Sep 25, 2025 · 3 revisions

Overview

aida-stochastic-sdb runs stochastic workloads against the StateDB. The CLI bundles four tools:

  • generate builds a synthetic stats model with uniform parameters.
  • record collects stats from historical blocks by replaying recorded substates.
  • replay runs a Markov-chain driven workload against a chosen StateDB implementation.
  • visualize serves the collected stats through a local web UI for inspection.

Each command operates on a shared JSON structure (produced by generate/record) that contains operation frequencies, Markov transitions, and empirical distributions for arguments such as contracts, storage keys, values, and snapshots. replay consumes this file to drive state operations, and visualize reads the same data for plotting.
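
Before feeding a stats file to replay or visualize, it can help to check what it actually contains. The sketch below makes no assumption about the schema: it lists whatever top-level keys the JSON object has and how large each one is. The function name `summarize_stats` is illustrative and not part of the tool.

```python
import json
from pathlib import Path


def summarize_stats(path):
    """Return {top-level key: short description} for a JSON stats file.

    No field names are assumed; the function only reports what is present.
    """
    with Path(path).open() as fh:
        data = json.load(fh)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    summary = {}
    for key, value in data.items():
        if isinstance(value, (list, dict)):
            # Containers: report their kind and number of entries.
            summary[key] = f"{type(value).__name__} ({len(value)} items)"
        else:
            # Scalars: report only the type name.
            summary[key] = type(value).__name__
    return summary
```

Calling `summarize_stats("stats.json")` and printing the result gives a quick overview without opening the full file.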

Workflow

Flowcharts

Two inputs flow into the shared stats.json artifact:

  • The Record branch ingests archived substates, streams them through the live executor, and captures actual access frequencies, transition probabilities, and argument distributions. Use this path when you want your workload to mirror historical chain behaviour.
  • The Generate branch starts from pure configuration defaults, sampling uniformly to bootstrap a synthetic model. This is useful for quick fuzzing or when you do not yet have real chain data available.

Regardless of origin, the consolidated stats.json powers the Replay command, which reproduces the state access patterns against any supported StateDB implementation (and optionally a shadow DB for validation).
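
The idea behind shadow validation can be sketched as follows: every operation is forwarded to both a primary and a shadow store, and read results are compared. This is a toy illustration under stated assumptions, not the tool's implementation; the classes `DictStateDB` and `ShadowedStateDB` and their method names are hypothetical.

```python
class DictStateDB:
    """Toy in-memory store standing in for a real StateDB (hypothetical)."""

    def __init__(self):
        self._storage = {}

    def set_state(self, contract, key, value):
        self._storage[(contract, key)] = value

    def get_state(self, contract, key):
        # Unset slots read as zero in this toy model.
        return self._storage.get((contract, key), 0)


class ShadowedStateDB:
    """Forwards each call to a primary and a shadow store and compares reads."""

    def __init__(self, primary, shadow):
        self.primary = primary
        self.shadow = shadow
        self.mismatches = []

    def set_state(self, contract, key, value):
        # Writes go to both stores so they stay in lockstep.
        self.primary.set_state(contract, key, value)
        self.shadow.set_state(contract, key, value)

    def get_state(self, contract, key):
        got = self.primary.get_state(contract, key)
        want = self.shadow.get_state(contract, key)
        if got != want:
            # Record the divergence instead of raising, so a run can continue.
            self.mismatches.append((contract, key, got, want))
        return got
```

Recording mismatches rather than aborting is one possible design; it loosely corresponds to running with --continue-on-failure, whereas a stricter variant would raise on the first divergence.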

Building the CLI

You need a Go toolchain (Go 1.22+) in PATH.

make aida-stochastic-sdb

The build creates ./build/aida-stochastic-sdb with all dependencies vendored automatically.

Command Reference

Generate

Uniform model bootstrap.

./build/aida-stochastic-sdb generate [flags]
  • Produces stats.json (override with --output).
  • Uses the distribution defaults defined in stochastic/generate.go.

The generated file contains the stochastic matrix and empirical CDFs needed by replay.
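
To make the two terms concrete: a stochastic matrix has rows that each sum to 1, and a CDF is a non-decreasing sequence ending at 1. A minimal sketch of their uniform versions is shown below. The operation labels are placeholders, and the actual defaults in stochastic/generate.go may be built differently.

```python
def uniform_transition_matrix(ops):
    """Every operation is equally likely to follow any other operation."""
    n = len(ops)
    return [[1.0 / n] * n for _ in ops]


def uniform_cdf(num_items):
    """Cumulative distribution over num_items equally likely items."""
    return [(i + 1) / num_items for i in range(num_items)]


# Placeholder operation labels, chosen only for illustration.
ops = ["GetBalance", "GetState", "SetState"]
matrix = uniform_transition_matrix(ops)
cdf = uniform_cdf(4)  # [0.25, 0.5, 0.75, 1.0]
```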

Flags

  • --log, -l: Set the logging verbosity (critical, error, warning, notice, info, debug). Default: info.
  • --output: Path where the generated stats file is written. Falls back to ./stats.json when omitted.
  • --block-length: Number of transactions per synthetic block. Default: 10.
  • --sync-period: Number of blocks per sync period in the generated workload. Default: 300.
  • --transaction-length: Controls the average number of operations inside a synthetic transaction. Default: 10.
  • --num-contracts: Size of the contract address pool sampled by the generator. Default: 1000.
  • --num-keys: Number of storage keys included in the generated distributions. Default: 1000.
  • --num-values: Number of storage values included in the generated distributions. Default: 1000.
  • --snapshot-depth: Depth of the snapshot history maintained while generating workloads. Default: 100.
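
As a rough worked example of how the sizing flags combine, the defaults above imply the following per-sync-period volumes. This assumes --transaction-length is exactly the mean number of operations per transaction and ignores any bookkeeping operations the generator may add around blocks and sync periods.

```python
def ops_per_sync_period(block_length=10, sync_period=300, transaction_length=10):
    """Return (transactions, approximate operations) in one sync period."""
    transactions = block_length * sync_period       # transactions per sync period
    operations = transactions * transaction_length  # average, not exact
    return transactions, operations


transactions, operations = ops_per_sync_period()
# With the documented defaults: 3000 transactions, about 30000 operations.
```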

Record

Extract stats from historical blocks.

./build/aida-stochastic-sdb record <firstBlock> <lastBlock> [flags]
  • Streams substates from --aida-db using the live DB executor.
  • Wraps the StateDB with a stochastic proxy to count operations and build distributions (stochastic/record.go).
  • Writes stats.json by default; override with --output.
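
Conceptually, counting operations to build a Markov model amounts to tallying which operation follows which and then normalising each row. The sketch below illustrates that idea only; the class `TransitionCounter` and the operation labels are hypothetical and do not reflect the code in stochastic/record.go.

```python
from collections import defaultdict


class TransitionCounter:
    """Tallies consecutive operation pairs and turns them into probabilities."""

    def __init__(self):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.prev = None

    def observe(self, op):
        # Count the transition prev -> op, then remember op for the next call.
        if self.prev is not None:
            self.counts[self.prev][op] += 1
        self.prev = op

    def probabilities(self):
        # Normalise each row so its entries sum to 1.
        result = {}
        for src, row in self.counts.items():
            total = sum(row.values())
            result[src] = {dst: n / total for dst, n in row.items()}
        return result


counter = TransitionCounter()
for op in ["GetState", "SetState", "GetState", "GetState"]:
    counter.observe(op)
probs = counter.probabilities()
# GetState is followed by SetState once and by GetState once: 0.5 each.
```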

Flags

  • --cpu-profile: Write a Go CPU profile to the provided path while recording.
  • --sync-period: Number of blocks per sync period used for aggregation. Default: 300.
  • --output: Destination file for collected stats. Defaults to ./stats.json when not set.
  • --workers, -w: Number of parallel substate readers. Default: 4.
  • --chainid: Chain ID inserted into the replay configuration stored in the stats file.
  • --aida-db: Path to the Aida substate/update/delete database. Required.
  • --cache: Cache size limit (in MiB) for StateDB or priming. Default: 8192.

Replay

Run the stochastic workload against a StateDB.

./build/aida-stochastic-sdb replay <blocks> <stats.json> [flags]
  • <blocks> controls the number of simulated blocks.
  • Instantiates the selected StateDB (geth by default; choose another with --db-impl and --db-variant).
  • Replays operations sampled from the Markov chain and empirical argument distributions.
  • Supports tracing (--trace, --trace-file, --trace-debug), profiling (--cpu-profile), memory reporting (--memory-breakdown), and validation through shadow DB flags.
  • RNG behaviour is reproducible with --random-seed; adjust argument ranges with --balance-range and --nonce-range.
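
The sampling loop can be pictured as a seeded random walk over the transition matrix, using inverse-transform sampling on each row. This is a minimal sketch of the general technique, not the tool's replay code; the function names and the convention of starting in state 0 are assumptions made for illustration.

```python
import bisect
import itertools
import random


def sample_index(cdf, rng):
    """Inverse-transform sampling: pick the first index whose CDF value exceeds u."""
    u = rng.random()  # uniform in [0, 1)
    # min() guards against a final CDF value slightly below 1 due to rounding.
    return min(bisect.bisect_right(cdf, u), len(cdf) - 1)


def walk(ops, matrix, steps, seed=-1):
    """Sample `steps` operations from a Markov chain, starting in state 0.

    A seed of -1 picks a fresh seed, mirroring the documented --random-seed default.
    """
    rng = random.Random(None if seed == -1 else seed)
    cdfs = [list(itertools.accumulate(row)) for row in matrix]
    state = 0
    trace = []
    for _ in range(steps):
        state = sample_index(cdfs[state], rng)
        trace.append(ops[state])
    return trace


# Two runs with the same seed produce the same operation sequence.
ops = ["GetState", "SetState"]
matrix = [[0.5, 0.5], [0.9, 0.1]]
assert walk(ops, matrix, 20, seed=42) == walk(ops, matrix, 20, seed=42)
```

Keeping the generator local to `walk` (rather than using the module-level `random` functions) is what makes a fixed seed sufficient for reproducibility here.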

Replay keeps temporary state under a scratch directory (override with --db-tmp) and removes it when finished.

Flags

  • --balance-range: Upper bound for randomly sampled account balances. Default: 1000000.
  • --carmen-schema: Carmen StateDB schema identifier to use when --db-impl targets Carmen. Default: 5.
  • --continue-on-failure: Keep running even when replay validation detects mismatches.
  • --cpu-profile: Write a Go CPU profile to the supplied path while replaying.
  • --debug-from: Emit trace-debug output starting from the given block number. Default: 0.
  • --memory-breakdown: Print a per-component memory usage summary after the run (if supported by the DB).
  • --nonce-range: Upper bound for random nonce sampling. Default: 1000000.
  • --random-seed: Seed for the pseudo-random generator (-1 picks a fresh seed). Default: -1.
  • --db-impl: Choose the primary StateDB implementation (geth, carmen, etc.). Default: geth.
  • --db-variant: Optional variant tag passed to the selected StateDB implementation.
  • --db-tmp: Directory used as scratch space for StateDB data. Falls back to the system temp directory when empty.
  • --db-logging: File path that receives low-level DB logging output.
  • --trace-file: Directory where tracing artifacts are stored. Default: ./.
  • --trace-debug: Enable verbose debug messages when tracing is active.
  • --trace: Turn on storage tracing and record operations.
  • --db-shadow-impl: StateDB implementation to run as a shadow for validation comparisons.
  • --db-shadow-variant: Variant tag used by the shadow StateDB implementation.
  • --log, -l: Logging verbosity for the replay run (critical, error, warning, notice, info, debug). Default: info.

Visualize

Launch a local dashboard for the stats file.

./build/aida-stochastic-sdb visualize <stats.json> [--port PORT]
  • Reads the same JSON file emitted by generate/record.
  • Serves an HTTP UI at http://localhost:PORT (default 8080) showing operation frequencies, queuing/counting stats, Markov chains, etc.
  • Stop via Ctrl+C.

Flags

  • --port, -v: HTTP port for the visualization UI. Default: 8080.

Typical Workflows

Uniform fuzzing

./build/aida-stochastic-sdb generate --output stats.json
./build/aida-stochastic-sdb replay 100 stats.json --random-seed 42
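
To fuzz with several seeds, the two commands above can be generated from a small driver script. The sketch below only builds the argument lists, using the flags documented on this page; the helper name `fuzz_commands` is illustrative.

```python
BINARY = "./build/aida-stochastic-sdb"


def fuzz_commands(seeds, blocks=100, stats="stats.json"):
    """Build one generate command followed by one replay command per seed."""
    cmds = [[BINARY, "generate", "--output", stats]]
    for seed in seeds:
        cmds.append(
            [BINARY, "replay", str(blocks), stats, "--random-seed", str(seed)]
        )
    return cmds


commands = fuzz_commands([42, 43, 44])
```

Each entry can then be passed to `subprocess.run(cmd, check=True)` so that the sweep stops at the first failing run.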

Replay historical behaviour

./build/aida-stochastic-sdb record 17700000 17700999 --aida-db /data/aida --output stats.json
./build/aida-stochastic-sdb replay 100 stats.json --db-impl geth --balance-range 200000

Inspect stats interactively

./build/aida-stochastic-sdb visualize stats.json --port 8081
