devshard v2 (v0.2.13-devshard-v2) #1289
Co-authored-by: Cursor <cursoragent@cursor.com>
Sets DevshardEscrowParams.MaxEscrowsPerEpoch to 500_000.
Skip startup only when the port is set negative; treat 0 as unset and fall back to 9400. Wire the same default into the join compose file via NODE_MANAGER_GRPC_PORT so devshard reaches the API without manual config.
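The port rule in this commit message can be sketched as a small resolver. This is a minimal illustration only: `resolveGRPCPort` and `defaultNodeManagerGRPCPort` are hypothetical names, not identifiers taken from the PR.

```go
package main

import "fmt"

// defaultNodeManagerGRPCPort is the fallback value named in the commit message.
// The constant name itself is an assumption made for this sketch.
const defaultNodeManagerGRPCPort = 9400

// resolveGRPCPort is a hypothetical helper. It returns the port to use and
// whether the gRPC server should start at all.
func resolveGRPCPort(configured int) (port int, enabled bool) {
	switch {
	case configured < 0:
		// Negative means "explicitly disabled": skip startup.
		return 0, false
	case configured == 0:
		// Zero means "unset": fall back to the default.
		return defaultNodeManagerGRPCPort, true
	default:
		return configured, true
	}
}

func main() {
	for _, configured := range []int{-1, 0, 9555} {
		port, enabled := resolveGRPCPort(configured)
		fmt.Printf("configured=%d port=%d enabled=%t\n", configured, port, enabled)
	}
}
```

Treating 0 as unset keeps a zero-valued config struct working without extra configuration, while a negative value stays available as an explicit opt-out.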
A participant restored to ACTIVE inherited the prior ConsecutiveInvalidInferences, so a single new failure could re-invalidate them immediately. Zero the counter when transitioning to INVALID and at every upcoming-to-effective promotion.
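A minimal sketch of the counter reset described above. The types, the threshold value, and the helper names are assumptions for illustration; only the field name `ConsecutiveInvalidInferences` and the two reset points come from the commit message.

```go
package main

import "fmt"

type Status int

const (
	StatusActive Status = iota
	StatusInvalid
)

type Participant struct {
	Status                       Status
	ConsecutiveInvalidInferences int64
}

// invalidationThreshold is a hypothetical value chosen for this sketch.
const invalidationThreshold = 3

// recordInvalidInference counts a failure and invalidates the participant once
// the threshold is reached. The counter is zeroed on the transition to INVALID,
// so a later restore to ACTIVE does not inherit the old count.
func recordInvalidInference(p *Participant) {
	p.ConsecutiveInvalidInferences++
	if p.ConsecutiveInvalidInferences >= invalidationThreshold {
		p.Status = StatusInvalid
		p.ConsecutiveInvalidInferences = 0
	}
}

// promoteToEffective models the upcoming-to-effective promotion: every
// participant starts the new effective set with a clean counter.
func promoteToEffective(ps []*Participant) {
	for _, p := range ps {
		p.ConsecutiveInvalidInferences = 0
	}
}

func main() {
	p := &Participant{Status: StatusActive}
	for i := 0; i < invalidationThreshold; i++ {
		recordInvalidInference(p)
	}
	fmt.Println("invalid:", p.Status == StatusInvalid, "counter:", p.ConsecutiveInvalidInferences)

	p.Status = StatusActive // restored to ACTIVE
	recordInvalidInference(p)
	fmt.Println("still active:", p.Status == StatusActive, "counter:", p.ConsecutiveInvalidInferences)
}
```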
Replace the hardcoded keeper.DevshardMaxNonce constant with a governance parameter on DevshardEscrowParams. VerifyDevshardSettlement now receives the bound from params; the settle msg server reads it before verifying. The v0.2.13 upgrade handler raises MaxNonce to 1_000_000 and bundles the existing MaxEscrowsPerEpoch=500_000 bump into the same step.
…2.13 v0.2.12 added MsgRespondDealerComplaints to InferenceOperationKeyPerms but did not migrate existing cold-to-warm grants, leaving pre-v0.2.12 DAPIs unable to respond to dealer complaints. Walk authz grants, key each pair off its MsgStartInference grant, and add the missing authorization with the source grant's expiration. Idempotent.
Wire CreateUpgradeHandler with InferenceKeeper and AuthzKeeper so the chain runs the v0.2.13 migrations at the upgrade height. No module ConsensusVersion bump: the handler edits existing collections, no inference store schema change.
# Devshard storage: Postgres backend + epoch pruning
Drop-in replacement for the unbounded single-file SQLite store on `main`.
SQLite-only deployments need no config change; new binaries auto-migrate
the legacy DB on first boot.
## Architecture
```
HostManager
  -> ManagedStorage        // 30s pruner, retain N=3 epochs
       -> SQLite           // PGHOST unset
       -> HybridStorage    // PGHOST set
            -> Postgres    // primary, sticky per-escrow
            -> SQLite      // local fallback while PG is down
```
Storage is partitioned by `epoch_id` (= `DevshardEscrow.epoch_index`):
- Postgres: `devshard_sessions`, `devshard_diffs`, `devshard_signatures`
  each `PARTITION BY RANGE (epoch_id)`. Partitions are created lazily;
  pruning is `DROP TABLE`.
- SQLite: one `epoch_<N>.db` per epoch plus a `_meta.db` routing index;
  pruning closes the pool and removes the file.
- Hybrid: per-escrow stickiness keeps a session on one backend.
`ManagedStorage` ticks every 30s, computes
`cutoff = max_observed_epoch + 1 - retain`, and prunes everything older.
An `EpochProvider` advances the cutoff on quiet hosts.
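The cutoff formula above can be checked with a short sketch. `pruneCutoff` and `epochsToPrune` are hypothetical helper names; the underflow guard for young chains is an assumption about how an implementation would handle `retain` exceeding the number of epochs seen so far.

```go
package main

import "fmt"

// pruneCutoff returns the oldest epoch that must be kept:
// cutoff = max_observed_epoch + 1 - retain.
// When fewer than `retain` epochs exist, nothing is prunable and the cutoff is 0.
func pruneCutoff(maxObservedEpoch, retain uint64) uint64 {
	if maxObservedEpoch+1 < retain {
		return 0 // guard against unsigned underflow
	}
	return maxObservedEpoch + 1 - retain
}

// epochsToPrune selects every existing epoch strictly older than the cutoff.
func epochsToPrune(existing []uint64, cutoff uint64) []uint64 {
	var out []uint64
	for _, e := range existing {
		if e < cutoff {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	cutoff := pruneCutoff(10, 3)
	fmt.Println("cutoff:", cutoff)
	fmt.Println("prune:", epochsToPrune([]uint64{6, 7, 8, 9, 10}, cutoff))
}
```

With `max_observed_epoch = 10` and `retain = 3`, the cutoff is 8, so epochs 8, 9, and 10 survive and everything below 8 is dropped.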
## Drop-in guarantees
- `PGHOST` unset -> SQLite-only, identical to before.
- `PGHOST` set -> hybrid mode, same env vars as `payloadstorage`.
- Legacy `/root/.dapi/data/devshard.db` is migrated to
  `/root/.dapi/data/devshard/` on first boot, then renamed
  `*.migrated.<unix>`. Idempotent across restarts.
- Per-host storage. No schema, proto, HTTP, or gossip changes.
## Tradeoffs
For simplicity, partitioning is by `epoch_id` only, not
`(epoch_id, escrow_id)`. Loading a session reads its diffs from the
shared epoch partition (indexed on `escrow_id`). The next step is per-escrow state snapshots (data +
additions) so readers skip the diff replay.
…poch Reuses the v0.2.10 grace-epoch primitive with UpgradeProtectionWindow=3000.
The pruning test queried latestEpoch at the very end and asserted that its session partition existed. But the advance-epochs loop exits via waitForNextEpoch after the last write, so by the time the assertion runs the chain's current epoch has no devshard activity and therefore no partition. Capture the epochIndex of the last tick's escrow during the loop and assert against that partition instead.
Problem:
API startup waited for devshard legacy migration and full session replay before
starting the ML/admin servers. On large devshard state this delayed port 9100 by
minutes even though most endpoints did not need recovered devshard sessions.
Solution:
Gate devshard session routes with a 503 initializing response, run legacy
migration in the background, then mark devshard ready and recover sessions
asynchronously. Requests after migration still lazily recover a single escrow
before serving it.
Flow:
startup -> register gated routes -> start servers
-> migrate legacy DB -> mark ready -> background recovery
request -> ready? no -> 503 initializing
request -> ready? yes -> session cached? yes -> serve
request -> ready? yes -> session cached? no -> recover escrow -> serve
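The gating step in the flow above can be sketched with the standard library. This is a sketch under assumptions: the PR's real server framework, route paths, and type names are not shown here, so `readiness`, `gate`, `okHandler`, and the `/devshard/session` path are hypothetical. The lazy per-escrow recovery step is left out.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// readiness flips to true once the background legacy migration has finished.
type readiness struct {
	ready atomic.Bool
}

// gate wraps a devshard session handler and answers 503 until ready is set.
func (r *readiness) gate(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		if !r.ready.Load() {
			http.Error(w, "devshard initializing", http.StatusServiceUnavailable)
			return
		}
		next.ServeHTTP(w, req)
	})
}

// okHandler stands in for a real session route.
var okHandler = http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
	w.WriteHeader(http.StatusOK)
})

// statusOf sends one in-memory request to h and returns the response code.
func statusOf(h http.Handler) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/devshard/session", nil))
	return rec.Code
}

func main() {
	var state readiness
	h := state.gate(okHandler)

	fmt.Println("before migration:", statusOf(h))
	state.ready.Store(true) // the background migration goroutine would do this
	fmt.Println("after migration:", statusOf(h))
}
```

Because the gate is registered before the servers start, the other endpoints can come up immediately while only the devshard routes report that they are still initializing.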
* devshard snapshots for hosts
* devshards recoversessions parallel workers
* devshard host snapshot on settlement

---------

Co-authored-by: David and Daniil Liberman <da@liberman.net>
Added #1326, which fixes the issue found: hosts could diverge from the user on SealedAcc / post_state_root because sealing used a wall-clock grace gate outside the signed diff.
* Move devshard inference sealing into deterministic state-machine auto-seal.

Host-local wall-clock prune tiers made seal timing node-dependent and risked diverging state roots. Fold eligible inferences during diff apply using nonce and ConfirmedAt-derived state clock gates, and have the host emit payload-prune events only after the machine seals them.

* Added a short path for sealing an inference: if the inference is validated or invalidated, don't wait for the grace period and seal it immediately. Additional check: before sealing, the inference must have one of the following statuses: StatusFinished, StatusValidated, StatusInvalidated, StatusTimedOut.

---------

Co-authored-by: akup <ak@neonavigation.com>
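A rough sketch of what a deterministic seal predicate could look like, based only on the commit message above. Every name and field here is hypothetical. Requiring both the nonce gate and the state-clock gate on the slow path, and letting the short path skip both, are assumptions of this sketch, since the message does not say how the gates are combined.

```go
package main

import "fmt"

type Status int

const (
	StatusStarted Status = iota
	StatusFinished
	StatusValidated
	StatusInvalidated
	StatusTimedOut
)

// Inference holds only values that would be carried in signed diffs, so every
// node evaluating canSeal sees the same inputs.
type Inference struct {
	Status      Status
	StartNonce  uint64 // assumed: nonce of the diff that started the inference
	ConfirmedAt int64  // assumed: unix seconds taken from the signed diff
}

// SealParams are hypothetical grace settings.
type SealParams struct {
	GraceNonce   uint64
	GraceTimeout int64
}

// canSeal decides sealing from diff-derived state only, never from the host's
// wall clock. stateClock is assumed to be derived from ConfirmedAt values.
func canSeal(inf Inference, currentNonce uint64, stateClock int64, clockDefined bool, p SealParams) bool {
	switch inf.Status {
	case StatusValidated, StatusInvalidated:
		return true // short path: no grace period
	case StatusFinished, StatusTimedOut:
		// Eligible, but must pass the grace gates below.
	default:
		return false // any other status is never sealed
	}
	if !clockDefined {
		return false
	}
	nonceElapsed := currentNonce >= inf.StartNonce+p.GraceNonce
	timeElapsed := stateClock >= inf.ConfirmedAt+p.GraceTimeout
	return nonceElapsed && timeElapsed
}

func main() {
	p := SealParams{GraceNonce: 10, GraceTimeout: 60}
	finished := Inference{Status: StatusFinished, StartNonce: 5, ConfirmedAt: 1000}
	fmt.Println(canSeal(finished, 12, 1100, true, p)) // nonce gate not yet passed
	fmt.Println(canSeal(finished, 15, 1100, true, p)) // both gates passed
	fmt.Println(canSeal(Inference{Status: StatusValidated}, 0, 0, false, p))
}
```

The point of the design is that two hosts applying the same diff sequence reach the same seal decision, which a wall-clock check cannot guarantee.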
the nonce isn't bound to real work. in that's the same counter the downtime punishment reads ( create/settle is permissionless by default ( not prescribing a fix since that's your design, but the root is using the nonce-derived upper bound as the actual completed count — binding the credit to signed per-slot completed work (or cross-checking against |
two more verification gaps in the v2 runtime this PR ships — both the same "sibling verifies, twin doesn't" shape, and i've got fixes open against main for each:
both are still present on |
I've seen both, and they are candidates for the next release in 1 or 2 weeks. We just need to keep the scope of this release finite.
* Parameters naming and inferenceSealGraceNonce, inferenceSealGraceTimeout moved to EscrowStart
* Don't seal inferences when stateClock is undefined (no confirmedAt value in latest inferences)
It is set in the escrow start message and is unchangeable during the escrow session. Default is 150. It is required for the e2e testermint test to pass; that test checks that autodealing works.
@@ -4,7 +4,7 @@ go 1.24.2

 replace (
 	cosmossdk.io/store => github.com/gonka-ai/cosmos-sdk/store v1.1.2-ps1
-	github.com/cosmos/cosmos-sdk => github.com/gonka-ai/cosmos-sdk v0.53.3-ps17
+	github.com/cosmos/cosmos-sdk => github.com/gonka-ai/cosmos-sdk v0.53.3-ps17-observability
Are we planning to pin this to a stable version instead of a feature branch?
@@ -788,8 +790,8 @@ github.com/golangci/revgrep v0.5.3 h1:3tL7c1XBMtWHHqVpS5ChmiAAoe4PF/d5+ULzV9sLAz
 github.com/golangci/revgrep v0.5.3/go.mod h1:U4R/s9dlXZsg8uJmaR1GrloUr14D7qDl8gi2iPXJH8k=
 github.com/golangci/unconvert v0.0.0-20240309020433-c5143eacb3ed h1:IURFTjxeTfNFP0hTEi1YKjB/ub8zkpaOqFFMApi2EAs=
 github.com/golangci/unconvert v0.0.0-20240309020433-c5143eacb3ed/go.mod h1:XLXN8bNw4CGRPaqgl3bv/lhz7bsGPh4/xSaMTbo2vkQ=
-github.com/gonka-ai/cosmos-sdk v0.53.3-ps17 h1:xw8ssDJDfl+/TnD9QMq/EZGzjnoh+6cvROqZE/MwNzU=
-github.com/gonka-ai/cosmos-sdk v0.53.3-ps17/go.mod h1:90S054hIbadFB1MlXVZVC5w0QbKfd1P4b79zT+vvJxw=
+github.com/gonka-ai/cosmos-sdk v0.53.3-ps17-observability h1:vWph4b1Xzvwj9jV3BVD6RXQLqRmCsGNyPAxePlFIU0Q=
Are we planning to pin this to a stable version instead of a feature branch?
stable version, not a feature branch.
Do you have any concerns on this?
The naming v0.53.3-ps17-observability breaks semantic versioning.
Basically in devshard But again you are right that we should add |
yeah fair @a-kuprin, the active-phase fee bounds it so it's not free like i implied, my bad.
Could you please dig deeper and prepare a PR with a proper fix?
ok dug in, it's bigger than the -1 we landed on: a node that's genuinely down keeps its full epoch reward.

empty diffs are the lever: they bump the nonce with no inference in them, so settlement credits the slot ~nonce/groupsize "completed" off pure nonce while Missed stays 0. run it to max, settle all-zero hoststats, and a 50-done/50-missed node that should be zeroed gets buried under ~1250 fake completed.

no collusion needed either: stale mempool is the only thing that makes a host withhold its sig, and an empty session never has pending txs, so honest hosts sign it fine. costs ~2e7 in per-nonce fees, nothing next to the reward it saves.

and it beats both downtime gates, not just the epoch one, the per-block slashing check too. devshard misses only hit that SPRT batched at settlement, so front-load the empty one and it never trips. inactive = zero reward, so that's the part that actually keeps the money.

all confirmed with PoCs on the v2 head. finalize-window subtraction won't close it btw: the empty active nonces survive that, the count has to come from real work not the nonce.
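The dilution arithmetic in the comment above can be reproduced with a toy model. This is illustrative only: `nonceDerivedCompleted` and `missedRatio` are hypothetical helpers, the nonce and group size are picked solely so that the division yields the ~1250 figure quoted, and the plain missed ratio stands in for the real downtime test, which is not shown in this thread.

```go
package main

import "fmt"

// nonceDerivedCompleted models the "~nonce/groupsize" credit the comment
// describes: completed work inferred from the nonce alone.
func nonceDerivedCompleted(nonce, groupSize uint64) uint64 {
	return nonce / groupSize
}

// missedRatio is a simplified stand-in for a downtime metric.
func missedRatio(completed, missed uint64) float64 {
	total := completed + missed
	if total == 0 {
		return 0
	}
	return float64(missed) / float64(total)
}

func main() {
	// Real record from the comment: 50 done, 50 missed.
	fmt.Printf("real missed ratio:   %.3f\n", missedRatio(50, 50))

	// Hypothetical empty-diff session: nonce and group size are made up so
	// that the nonce-derived credit comes out at 1250.
	fake := nonceDerivedCompleted(20000, 16)
	fmt.Printf("padded missed ratio: %.3f\n", missedRatio(50+fake, 50))
}
```

The missed count is unchanged, but the padded denominator pushes the ratio from one half to a few percent, which is the burying effect the comment describes.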
Actually, an empty diff is very undesirable for the protocol and normally shouldn't appear. Another attack with empty diffs is skipping slots and sending real work only to some chosen host in the devshard.

There is still a lot of work to do on devshards to make them stable at the protocol level. I don't think we need to push it right now, as devshard is now primarily used for gateway stabilisation. So the main point is that an empty diff is not what the target version of the protocol expects, but the current state is mostly experimental, for gateway purposes, and every gateway is currently whitelisted and trusted.

Of course we can add real inference counting, but it should also be added to the settlement message. On the one hand, we would be changing part of the protocol in a way that legitimizes the host-skip attack I've described. But we need a legal way to skip inferences for some hosts anyway (for example during cPoC, see https://github.com/a-kuprin/gonka/blob/devshard-testenv/devshard/docs/proposals/CPOC_PROTOCOL.md). So adding a real inference count to the settlement message is something we should have, and it is very easy to add.
yeah, attested per-slot count in the settlement message is the move — only thing i'd watch is it rides in the signed host_stats, not a value the settler passes in, otherwise you've just moved the trust. nonce stays as the cap. |
I found three settlement issues in devshard_settlement.go — seems all can lock or drain escrow funds
    sm.mu.Lock()
    defer sm.mu.Unlock()

    if err := sm.inferenceStore.DeleteSealedInferences(sm.state.EscrowID); err != nil {
I might be missing some context, but why do we prune sealed tables at startup?
This might cause data loss after restart in some scenarios
This is session recovery.
The source of truth here is the diffs, not the local data. Also, sealed inferences are only for observability.
Not a blocker
interface.go:84 says these counters survive restarts, but sealed ones don't: recovery calls DeleteSealedInferences() (clears sealed_validation_obs) and never rebuilds them from diffs. So a session's sealed validation data is lost after restart
DB write error here can mutate state but drop tx from diff which may cause checksum mismatches and reject diffs
Note: I don't know why, but github doesn't show the exact line
    var applied []*types.DevshardTx
    for _, tx := range txs {
        if err := sm.applyTx(tx); err != nil {
            if tx.GetStartInference() != nil {
                sm.restoreMutable(snap)
                return nil, nil, fmt.Errorf("mandatory start inference: %w", err)
            }
            continue // <--
        }
        applied = append(applied, tx)
        ...

    SELECT c.relname
    FROM pg_class c
    JOIN pg_inherits i ON i.inhrelid = c.oid
    JOIN pg_class p ON p.oid = i.inhparent
    WHERE p.relname IN ('devshard_sessions', 'devshard_diffs', 'devshard_signatures', 'devshard_snapshots')
ensurePartition creates 8 partitions per epoch, but this pruneBefore query only lists 5 parents - leading to unbounded storage growth over time
valid, should be fixed
        return c.JSON(http.StatusOK, []prometheusTargetGroup{})
    }

    versions := s.configManager.GetDevshardVersions().Versions
Small fix to avoid a data race:
    versions := slices.Clone(s.configManager.GetDevshardVersions().Versions)
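A standalone sketch of why a cloned slice helps here. `configManager`, `Snapshot`, and `Replace` are hypothetical stand-ins, not the PR's types; the sketch also assumes the clone is taken while a lock is held, which the suggested one-liner only achieves if the getter itself synchronizes access.

```go
package main

import (
	"fmt"
	"slices"
	"sync"
)

type Version struct{ Name string }

// configManager is a hypothetical holder of a slice that a background
// updater may rewrite in place.
type configManager struct {
	mu       sync.RWMutex
	versions []Version
}

// Snapshot returns a private copy made under the read lock, so callers can
// iterate without sharing the backing array with writers.
func (m *configManager) Snapshot() []Version {
	m.mu.RLock()
	defer m.mu.RUnlock()
	return slices.Clone(m.versions)
}

// Replace overwrites one element in place under the write lock.
func (m *configManager) Replace(i int, v Version) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.versions[i] = v
}

func main() {
	m := &configManager{versions: []Version{{"v1"}, {"v2"}}}
	snap := m.Snapshot()
	m.Replace(0, Version{"v3"}) // a later in-place update
	fmt.Println(snap[0].Name, m.Snapshot()[0].Name)
}
```

`slices.Clone` makes a shallow copy, which is enough when the elements are plain values; elements holding pointers or nested slices would still share that inner data.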
@0xMayoor @Ryanchen911 @x0152 For v0.2.13-v2, created the release from the current state, going to propose it today |
More like a documentation mismatch
Yes, it should be done, and we should check it in settlement, not from the live parameter
Meaningless, as a colluding consensus can do a lot of bad things, like claiming there was a lot of inference. And we trust the devshard hosts' consensus
This PR prepares the devshard v2 release.
This is the first devshard-only upgrade, which operates independently of usual chain upgrades. Once approved, v2 will run in parallel with the existing v1 devshard runtime.
See the upgrade design doc and the versioned/ package for details.
Upgrade process
- `devshardd` binary as a Gonka release artifact
- `DevshardEscrowParams.approved_versions` (defining the name, binary download URL, and sha256 hash)
- `versiond` automatically downloads the binary and serves it under the `/devshard/v2` prefix
- `/devshard/v2` is available, contributors can test it before gateways switch primary traffic to v2

No manual host steps are expected during this type of upgrade.
devshard
- `MsgFinishInference` transactions so the sequencer can pick them up from another host's mempool

decentralized-api
The changes in the `decentralized-api/` module are fully backward compatible and do not need to be activated before the next mainnet release.
- `GetRuntimeConfig` gRPC long-poll

inference-chain
The changes in the `inference-chain/` module are wire-compatible and do not need to be activated before the next mainnet release.
- `version` field to `state_root_and_protocol_version` in the devshard settlement message proto
- `DevshardEscrowParams`
- `create_devshard_fee` and `fee_per_nonce` to `DevshardEscrow` to snapshot active fees at escrow creation

deploy
Proposed Bounties