
fix(devshard): align EvaluateValidationResponse with mainnet validation policy #1283

Open
0xMayoor wants to merge 2 commits into gonka-ai:upgrade-v0.2.14 from 0xMayoor:fix/devshard-validation-failopen


@0xMayoor
Contributor

EvaluateValidationResponse auto-passes in two cases where mainnet rejects.

Empty original logprobs short-circuit CompareLogits to 1.0 because customDistance returns 0 when the original is empty. Mainnet rejects when either side has no logits. As a result, an executor can serve a fabricated completion with no logprobs and every validator votes valid.
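To make the short-circuit concrete, here is a minimal sketch of how a distance that only iterates over the original side can collapse to 0. This is not the real customDistance or CompareLogits: `toyDistance` and `toySimilarity` are hypothetical names, and the averaging rule is an assumption chosen only to reproduce the behavior described above.

```go
package main

import "math"

// toyDistance is a HYPOTHETICAL stand-in for customDistance. It averages the
// (capped) absolute log-probability gap over the ORIGINAL positions only.
func toyDistance(original, revalidated []float64) float64 {
	if len(original) == 0 {
		return 0 // nothing to compare, so no distance accumulates
	}
	var sum float64
	for i, o := range original {
		if i >= len(revalidated) {
			sum += 1 // a missing position counts as a full mismatch
			continue
		}
		sum += math.Min(1, math.Abs(o-revalidated[i]))
	}
	return sum / float64(len(original))
}

// toySimilarity follows the "1.0 means identical" convention (assumed here).
func toySimilarity(original, revalidated []float64) float64 {
	return 1 - toyDistance(original, revalidated)
}
```

With this shape, `toySimilarity(nil, revalidated)` is 1.0 no matter what the validator's re-run produced, which is the auto-pass the description refers to.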

A 400 or 422 from the validator's own re-execution returns Valid:true before the body is read. The re-run uses executor-supplied enforced_tokens, so a malformed response pushes the backend into a 4xx and auto-wins. Mainnet has no 4xx auto-pass path at all.

Fix: reject the asymmetric empty-logits case (both-empty stays valid so reasoning-burn empties still pass), return Valid:false on 4xx. The 4xx is triggered by executor-controlled input so failing closed is the safer default.
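A minimal sketch of the asymmetric empty-logits guard described in the fix, assuming logprobs arrive as slices. The `Logprob` type, `errAsymmetricLogits`, and `checkLogitsPresence` are hypothetical names for illustration and are not taken from the PR diff.

```go
package main

import "errors"

// Logprob is a HYPOTHETICAL stand-in for one token's log-probability entry.
type Logprob struct {
	Token   string
	Logprob float64
}

// errAsymmetricLogits is an illustrative sentinel, not a name from the codebase.
var errAsymmetricLogits = errors.New("logits present on only one side")

// checkLogitsPresence rejects the case where exactly one side has no logits.
// Both-empty is let through so that empty reasoning-burn responses still pass.
func checkLogitsPresence(original, revalidated []Logprob) error {
	if (len(original) == 0) != (len(revalidated) == 0) {
		return errAsymmetricLogits
	}
	return nil
}
```

A caller would run this check before any similarity comparison and treat a non-nil error as an invalid vote rather than computing a score.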

Test fails on current code, passes with the fix.

PR #1282 logs this path but keeps the behavior unchanged.

@0xMayoor 0xMayoor closed this May 30, 2026
@0xMayoor 0xMayoor reopened this May 30, 2026
@tcharchian tcharchian requested review from a-kuprin and patimen June 2, 2026 00:33
@patimen
Collaborator

patimen commented Jun 2, 2026

/run-integration

@a-kuprin
Collaborator

a-kuprin commented Jun 2, 2026

It's not true that mainnet has no autopass. Here is the snippet from
decentralized-api/internal/validation/inference-validation.go (lines 984-999):

	// If the validator's inference node rejects the payload (400/422), treat validation as passed.
	// This can happen when the original inference could not be executed due to upstream payload rejection,
	// and validators on older versions may still attempt re-execution.
	if resp.StatusCode == http.StatusBadRequest || resp.StatusCode == http.StatusUnprocessableEntity {
		logging.Warn("Validator inference node rejected payload; treating validation as passed", types.Validation,
			"inferenceId", inference.InferenceId,
			"status", resp.StatusCode,
			"body", string(respBodyBytes))
		return &SimilarityValidationResult{
			BaseValidationResult: BaseValidationResult{
				InferenceId:   inference.InferenceId,
				ResponseBytes: []byte{},
			},
			Value: 1.0,
		}, nil
	}

So the logic was: if we have evidence that the executor cheated, the validator marks the request as invalid; if the validator cannot prove it, the request auto-passes.

I think we should keep the mainnet logic for a while and gather logs, to see whether there are situations where inferences that go through this path should definitely be marked invalid.

So let's keep the logging there and the autopass, in line with mainnet semantics.

The guard before validationpkg.CompareLogits is valid and should be kept. Mainnet has the guard and devshard is missing it.

@0xMayoor
Contributor Author

0xMayoor commented Jun 2, 2026

Agreed, @a-kuprin. Reverting 4xx to autopass + warn and keeping the empty-logits guard.

One heads-up: mainnet's guard at :978 is || (errors if either side is empty), while mine is XOR (allows both-empty for the kimi-k2.6 reasoning-burn case from #1233).
Let me know if you'd rather match mainnet exactly.
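The difference between the two guards can be enumerated directly. The sketch below models each policy as a predicate over "is this side empty" and collects the cases where they disagree. All names (`presenceCase`, `rejectEitherEmpty`, `rejectAsymmetric`, `disagreements`) are hypothetical, and the predicates follow the description in this comment rather than the actual source.

```go
package main

// presenceCase records whether each side's logits are empty.
type presenceCase struct{ origEmpty, revalEmpty bool }

// rejectEitherEmpty models the || guard: reject if either side is empty.
func rejectEitherEmpty(c presenceCase) bool { return c.origEmpty || c.revalEmpty }

// rejectAsymmetric models the XOR guard: reject only if exactly one side is empty.
func rejectAsymmetric(c presenceCase) bool { return c.origEmpty != c.revalEmpty }

// disagreements returns every case where the two guards give different verdicts.
func disagreements() []presenceCase {
	var out []presenceCase
	for _, o := range []bool{false, true} {
		for _, r := range []bool{false, true} {
			c := presenceCase{o, r}
			if rejectEitherEmpty(c) != rejectAsymmetric(c) {
				out = append(out, c)
			}
		}
	}
	return out
}
```

Under these assumptions the guards disagree on exactly one input, both sides empty, so choosing between them comes down to whether a both-empty response should be rejected or allowed.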

@patimen
Collaborator

patimen commented Jun 2, 2026

/run-integration

5 participants