Replace legacy db_*_i64 intrinsics with KV intrinsics #269

Open
heifner wants to merge 1 commit into master from feature/db-kv
@heifner heifner commented Mar 22, 2026

Summary

This PR replaces the 60 legacy EOSIO db_*_i64 and db_idx* host functions with 22 new KV (key-value) database intrinsics. All contract storage now uses a unified key-value model instead of the legacy multi-table system inherited from EOSIO.

Companion CDT PR: Wire-Network/wire-cdt#41


Why

The legacy database layer is a complex subsystem inherited from EOSIO:

  • 60 host functions (10 primary + 50 secondary index) with intricate iterator caching, table_id indirection, and 6 different chainbase object types
  • Every contract operation requires a two-step lookup: find table_id_object → then find key_value_object within it
  • Five separate secondary index types (idx64, idx128, idx256, idx_double, idx_long_double), each with 10 host functions and its own chainbase object
  • The iterator cache maintains raw pointers into chainbase that can be invalidated by session rollback — a recurring source of upstream security fixes

Wire is a new chain that hasn't launched yet. We have the rare opportunity to replace this before any production state exists.


What Changed

Chain Library (22 new intrinsics replacing 60)

| Category | Old | New |
| --- | --- | --- |
| Primary CRUD | db_store/update/remove/get_i64 | kv_set, kv_get, kv_erase, kv_contains |
| Primary iteration | db_find/next/previous/lowerbound/upperbound/end_i64 | kv_it_create/destroy/status/next/prev/lower_bound/key/value |
| Secondary indices | 50 functions across 5 types | 12 unified kv_idx_* functions |

Key Design Decisions

  1. SSO (Small String Optimization) — Keys ≤24 bytes stored inline in the chainbase object, avoiding heap allocation. The standard key layout [table:8B][scope:8B][pk:8B] = 24 bytes fits exactly in SSO.

  2. Integer fast-path comparator — 8-byte keys compared via bswap64 + integer comparison instead of memcmp. Endian-portable via kv_load_be64 (bswap on LE, identity on BE).

  3. Unified secondary index — One kv_index_object type replaces all 5 legacy secondary index types. Secondary keys stored as variable-length bytes with order-preserving encoding.

  4. key_format field — Each row has a key_format (0=raw, 1=standard 24-byte). SHiP uses this to deterministically decode keys back to legacy (code, scope, table, pk) format.

  5. kv_idx_find_secondary/kv_idx_lower_bound return int32_t — Returns -1 for not-found (no handle allocated), >= 0 for valid handle. kv_idx_lower_bound returns handle in iterator_end state when bound is past all entries but table is non-empty (enables kv_idx_prev for reverse iteration). Saves 1-2 host function calls per secondary lookup.

  6. Iterator cached-ID fast path — Iterators cache the chainbase object ID. On next/prev, the fast path does db.find(cached_id) + O(1) iterator_to + O(1) advance, avoiding the expensive composite-key lower_bound re-seek. Falls back to composite key lookup only when the row was erased mid-iteration. The key/value read functions also use the cached ID, eliminating composite index lookups for the common case. Per-step cost drops from 3 composite index lookups to 3 simple integer-key lookups (~3x faster constant factor).
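
A minimal sketch of the standard key layout from decision 1, to show why the 24-byte key sorts correctly under a plain bytewise compare. Only the [table:8B][scope:8B][pk:8B] layout comes from this PR; the helper names (`store_be64`, `make_standard_key`, `compare_keys`) are hypothetical, and big-endian field encoding is an assumption inferred from the `kv_load_be64` and `be_key_stream` naming.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical names; only the 24-byte [table][scope][pk] layout is from the PR.
constexpr std::size_t kStandardKeySize = 24;
using standard_key = std::array<unsigned char, kStandardKeySize>;

// Write a 64-bit value most-significant byte first (assumed encoding).
inline void store_be64(unsigned char* out, std::uint64_t v) {
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<unsigned char>(v & 0xFF);
        v >>= 8;
    }
}

inline standard_key make_standard_key(std::uint64_t table,
                                      std::uint64_t scope,
                                      std::uint64_t pk) {
    standard_key k{};
    store_be64(k.data(), table);
    store_be64(k.data() + 8, scope);
    store_be64(k.data() + 16, pk);
    return k;
}

// Bytewise compare; with big-endian fields this matches numeric
// (table, scope, pk) ordering.
inline int compare_keys(const standard_key& a, const standard_key& b) {
    return std::memcmp(a.data(), b.data(), kStandardKeySize);
}
```

Under these assumptions a key is exactly 24 bytes, so it fits the inline SSO buffer with no heap allocation, and rows that share a table and scope are contiguous and ordered by primary key.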

RPC API

  • get_table_rows — Fully functional for both primary AND secondary index queries. Supports all key types: i64, i128, sha256/i256, float64, float128. Forward/reverse iteration, pagination with next_key, scope filtering.
  • get_kv_rows — New endpoint for querying format=0 (raw_table) entries with composite key decoding. Uses be_key_codec with NUL-escape string encoding.
  • get_table_by_scope — Reimplemented with KV prefix scanning.
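
To illustrate what NUL-escape string encoding buys in a composite key, here is a rough sketch. The PR does not state the exact bytes be_key_codec uses, so the scheme below (0x00 becomes 0x00 0x01, terminator is 0x00 0x00) and both function names are assumptions chosen only to show the idea: the encoded form is self-delimiting and keeps bytewise order equal to string order.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed scheme, not confirmed by the PR:
// each 0x00 in the input -> 0x00 0x01, end of string -> 0x00 0x00.
inline std::string encode_nul_escaped(const std::string& s) {
    std::string out;
    out.reserve(s.size() + 2);
    for (char c : s) {
        out.push_back(c);
        if (c == '\0') out.push_back('\x01');
    }
    out.append(2, '\0');
    return out;
}

// Decode one string from the front of `in`. On success, `consumed` is the
// number of encoded bytes read, so the next key component starts there.
inline bool decode_nul_escaped(const std::string& in, std::string& out,
                               std::size_t& consumed) {
    out.clear();
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != '\0') { out.push_back(in[i]); continue; }
        if (i + 1 >= in.size()) return false;                      // dangling escape byte
        if (in[i + 1] == '\0') { consumed = i + 2; return true; }  // terminator
        if (in[i + 1] != '\x01') return false;                     // unknown escape
        out.push_back('\0');
        ++i;
    }
    return false;  // input ended before the terminator
}
```

With this scheme "a" sorts before "a\0", which sorts before "ab", matching the order of the unencoded strings.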

SHiP Compatibility

  • Format=1 rows (standard 24-byte keys) emitted as "contract_row" — backward compatible with Hyperion
  • Format=0 rows (raw keys) emitted as "contract_row_kv" — new delta type, clients opt in
  • Breaking: contract_table deltas removed (tables are implicit in KV key prefixes). Secondary index deltas consolidated into contract_index_kv (generic bytes instead of typed fields). SHiP consumers that parse secondary index deltas or track table creation/removal events must be updated.

Deep Mind

  • Standard keys (format=1) emit DB_OP INS/UPD/REM — same format as legacy for backward compatibility
  • Raw keys (format=0) emit KV_OP INS/UPD/REM — new log format with hex-encoded keys
  • UPD correctly logs old_payer:new_payer (old payer captured before db.modify)

Snapshot Format

  • Snapshot DTOs use std::vector<char> instead of shared_blob — JSON snapshots encode key/value fields as hex (previously base64). Binary snapshot format is unchanged.
  • Pre-V7 snapshot loading removed (Wire starts from genesis, no legacy state to migrate)
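
For readers updating JSON snapshot tooling, a small sketch of hex-encoding a `std::vector<char>` field. The function name `to_hex` and the choice of lowercase digits are assumptions; the PR only states that key/value fields are now hex rather than base64.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: two hex digits per byte, lowercase (assumed).
inline std::string to_hex(const std::vector<char>& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    out.reserve(bytes.size() * 2);
    for (char c : bytes) {
        // Go through unsigned char so bytes >= 0x80 do not sign-extend.
        const unsigned char b = static_cast<unsigned char>(c);
        out.push_back(digits[b >> 4]);
        out.push_back(digits[b & 0x0F]);
    }
    return out;
}
```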

CDT Changes

  • #include <sysio/multi_index.hpp> and sysio::multi_index still work — they forward to the KV implementation
  • Same for sysio::singleton
  • New sysio::kv::table for zero-copy high-performance contracts
  • New sysio::kv::raw_table for ordered key-value store with custom keys
  • Post-increment/decrement deleted on all iterators for performance (use ++it / --it)
  • be_key_stream supports all types: int8-128 (signed/unsigned), float, double, bool, name, NUL-escaped strings, vector
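
As an illustration of order-preserving key encoding for signed integers and doubles, a sketch using the usual bit tricks. This is not taken from the CDT source: `encode_i64`, `encode_f64`, and `to_be_bytes` are hypothetical names, and whether be_key_stream uses exactly these transforms is an assumption.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

using key8 = std::array<unsigned char, 8>;

// Big-endian byte order so lexicographic compare equals numeric compare.
inline key8 to_be_bytes(std::uint64_t v) {
    key8 out{};
    for (int i = 7; i >= 0; --i) {
        out[i] = static_cast<unsigned char>(v & 0xFF);
        v >>= 8;
    }
    return out;
}

// Signed integers: flipping the sign bit maps INT64_MIN..INT64_MAX
// onto 0..UINT64_MAX while keeping the order.
inline key8 encode_i64(std::int64_t v) {
    return to_be_bytes(static_cast<std::uint64_t>(v) ^ (std::uint64_t{1} << 63));
}

// IEEE-754 doubles: invert all bits of negatives, set the sign bit of
// non-negatives. NaN ordering is not addressed in this sketch.
inline key8 encode_f64(double d) {
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    const std::uint64_t sign = std::uint64_t{1} << 63;
    bits = (bits & sign) ? ~bits : (bits | sign);
    return to_be_bytes(bits);
}
```

`std::array` compares lexicographically, so `encode_i64(-1) < encode_i64(0)` holds even though the raw two's-complement bytes of -1 would sort last.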

Pros

Simplicity

  • 60 → 22 host functions — 63% reduction in attack surface
  • 6 → 2 chainbase object types: kv_object + kv_index_object replace table_id_object + key_value_object + 5 secondary index types
  • No more table_id_object indirection — every lookup is direct
  • No more iterator_cache with raw pointers — KV iterators re-seek by key with cached-ID fast path
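
The re-seek design with a cached-ID fast path can be modeled in a few lines. This is a toy, not the chain implementation: `std::map` stands in for the composite-key index, an `unordered_map` for the id index, and every name (`toy_store`, `toy_iterator`) is hypothetical. The point it shows is that the iterator holds a key copy and an id, never a pointer into the store, so an erased row is detected instead of dereferenced.

```cpp
#include <cassert>
#include <cstdint>
#include <iterator>
#include <map>
#include <string>
#include <unordered_map>

// Toy model only; all names are hypothetical.
struct toy_store {
    struct row { std::uint64_t id; std::string value; };
    using index = std::map<std::string, row>;

    index by_key;                                        // ordered "composite key" index
    std::unordered_map<std::uint64_t, index::iterator> by_id;  // id index
    std::uint64_t next_id = 0;

    void set(const std::string& key, const std::string& value) {
        auto it = by_key.find(key);
        if (it != by_key.end()) { it->second.value = value; return; }
        it = by_key.emplace(key, row{next_id, value}).first;
        by_id[next_id++] = it;
    }
    void erase(const std::string& key) {
        auto it = by_key.find(key);
        if (it == by_key.end()) return;
        by_id.erase(it->second.id);
        by_key.erase(it);
    }
};

struct toy_iterator {
    std::string key;            // copy of the current key, used for re-seek
    std::uint64_t cached_id = 0;
    bool at_end = true;

    void seek(toy_store& s, const std::string& lower) {
        load(s, s.by_key.lower_bound(lower));
    }
    void next(toy_store& s) {
        if (at_end) return;
        auto hit = s.by_id.find(cached_id);
        if (hit != s.by_id.end()) load(s, std::next(hit->second));  // fast path: id lookup + advance
        else load(s, s.by_key.upper_bound(key));                     // row erased: re-seek by key
    }

private:
    void load(toy_store& s, toy_store::index::iterator it) {
        at_end = (it == s.by_key.end());
        if (!at_end) { key = it->first; cached_id = it->second.id; }
    }
};
```

Erasing the row under an iterator and then calling `next` lands on the following key via the fallback path; when the row still exists, the id lookup avoids the ordered search.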

Security

  • Structurally prevents the most critical upstream vulnerability (cross-contract write via iterator reuse) — writes are keyed by receiver
  • Iterator re-seek design eliminates use-after-free bugs from session rollback
  • 95+ dedicated kv_api_tests + security audit document
  • All memcpy/memset calls in SSO assign functions guarded against null pointers with size checks

Performance

  • SSO keys avoid heap allocation for the common case
  • Eliminates table_id_object lookup on every operation
  • kv::table zero-copy path for trivially_copyable structs
  • Integer fast-path comparator for 8-byte keys (endian-portable)
  • Secondary index lookups save 1-2 host calls via int32_t return encoding
  • Cached-ID iterator fast path: next/prev/key/value use O(1) iterator_to instead of composite key re-seek
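
A sketch of the integer fast-path comparator, assuming keys are compared as unsigned bytes. `load_be64` is a hypothetical stand-in for the PR's `kv_load_be64`; it uses byte shifts, which are endian-independent and which compilers typically lower to a load plus byte swap on little-endian targets. `compare_key_bytes` is likewise a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for kv_load_be64: read 8 bytes as a big-endian integer.
inline std::uint64_t load_be64(const unsigned char* p) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) v = (v << 8) | p[i];
    return v;
}

// Returns <0, 0, >0 like memcmp. Two 8-byte keys take the integer fast path;
// everything else falls back to memcmp plus a length tie-break.
inline int compare_key_bytes(const unsigned char* a, std::size_t a_len,
                             const unsigned char* b, std::size_t b_len) {
    if (a_len == 8 && b_len == 8) {
        const std::uint64_t x = load_be64(a);
        const std::uint64_t y = load_be64(b);
        return (x > y) - (x < y);
    }
    const std::size_t n = a_len < b_len ? a_len : b_len;
    if (n != 0) {
        if (int c = std::memcmp(a, b, n)) return c;
    }
    return (a_len > b_len) - (a_len < b_len);  // shorter key sorts first on a shared prefix
}
```

Both paths give the same ordering for 8-byte keys because a big-endian load preserves lexicographic byte order.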

RAM Efficiency

  • sysio.token: 21% less RAM per balance row (184 bytes vs 232 legacy, no secondary index)
  • Eliminates per-scope table_id_object overhead (108 bytes per unique scope)
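
The percentages quoted in this description can be checked with a one-line helper (the function name is illustrative only): (232 - 184) / 232 is about 20.7%, which rounds to the stated 21%, and (60 - 22) / 60 is about 63.3% for the host function count.

```cpp
#include <cassert>

// Percentage reduction going from `before` to `after`.
constexpr double percent_reduction(double before, double after) {
    return (before - after) / before * 100.0;
}
```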

Cons

Breaking Change

  • Not backward compatible with legacy chain state — Wire hasn't launched, so acceptable
  • Pre-compiled WASMs with legacy db_* imports must be recompiled
  • SHiP schema changes: contract_table removed, secondary index deltas consolidated

Secondary Index Overhead

  • Per-row cost with secondary indices is ~7% higher due to SSO buffers in kv_index_object
  • Offset by eliminating table_id_object overhead

CDT Dependency

  • The CDT must be updated in lockstep — the wire-cdt feature/cdt-db-kv branch is required

@heifner heifner changed the title from "Replace legacy db_*_i64 intrinsics with KV database" to "Replace legacy db_*_i64 intrinsics with KV intrinsics" Mar 22, 2026
Comment on lines +2 to +6
add_contract( bench_kv_db bench_kv_db bench_kv_db.cpp )
else()
configure_file( ${CMAKE_CURRENT_SOURCE_DIR}/bench_kv_db.wasm ${CMAKE_CURRENT_BINARY_DIR}/bench_kv_db.wasm COPYONLY )
configure_file( ${CMAKE_CURRENT_SOURCE_DIR}/bench_kv_db.abi ${CMAKE_CURRENT_BINARY_DIR}/bench_kv_db.abi COPYONLY )
endif()
Considering this is new code, shouldn't we be using GLOB[_RECURSE] file sets?

This applies to all new CMake files in the PR under unittests/test-contracts/*/CMakeLists.txt

@jglanz jglanz left a comment

A few changes

@heifner heifner force-pushed the feature/db-kv branch 5 times, most recently from e34729f to 3014888 Compare March 24, 2026 22:28
@heifner heifner force-pushed the feature/db-kv branch 3 times, most recently from fc29c55 to 14b9eb6 Compare March 28, 2026 03:54
…tioned index

Host-side KV database:
- 22 KV intrinsics replacing 60 legacy db_*_i64 intrinsics
- Format-partitioned by_code_key index: format=0 (raw_table) and format=1
  (multi_index/table) stored in separate partitions, preventing collisions
- key_format parameter added to kv_get, kv_erase, kv_contains, kv_it_create
- config::kv_format_raw and config::kv_format_standard constants
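
A toy model of why partitioning the index by format prevents collisions: the same key bytes stored under format=0 and format=1 are distinct entries. The composite key order (code, format, key) and the names `toy_index`, `toy_set`, and `toy_get` are assumptions for illustration; the constant values 0 and 1 follow the key_format description in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>

// Values follow the PR text (0 = raw, 1 = standard); names mirror config::kv_format_*.
constexpr std::uint8_t kv_format_raw = 0;
constexpr std::uint8_t kv_format_standard = 1;

// Assumed composite key order: (code, format, key bytes).
using composite_key = std::tuple<std::uint64_t, std::uint8_t, std::string>;
using toy_index = std::map<composite_key, std::string>;

inline void toy_set(toy_index& idx, std::uint64_t code, std::uint8_t fmt,
                    const std::string& key, const std::string& value) {
    idx[composite_key{code, fmt, key}] = value;
}

// Returns nullptr when no row exists in that (code, format) partition.
inline const std::string* toy_get(const toy_index& idx, std::uint64_t code,
                                  std::uint8_t fmt, const std::string& key) {
    auto it = idx.find(composite_key{code, fmt, key});
    return it == idx.end() ? nullptr : &it->second;
}
```

This is also why the read-side intrinsics need the key_format argument: without it, a lookup could not tell which partition to search.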

New API endpoint:
- /v1/chain/get_kv_rows for querying format=0 (raw_table) KV data
- Supports JSON key decoding via ABI key metadata, pagination, bounds, reverse
- be_key_codec for server-side BE key encoding/decoding

SysioTester build-tree support:
- Auto-discover libraries, includes, and vcpkg paths from uninstalled wire-sysio
- Enables CDT integration tests against build dir without install

All contracts recompiled with updated CDT intrinsic signatures.
Reference data regenerated (deep-mind log, snapshots, consensus blockchain).