[Hackathon] stanford-ml-phd: EigenTrust plugin for the trust layer#6
mariagorskikh wants to merge 3 commits into
The default score_average trust plugin treats every report as equally credible. A malicious clique can therefore inflate (or trash) any agent's score by spamming reports. EigenTrust (Kamvar, Schlosser, Garcia-Molina; WWW '03) weighs each report by the reporter's current trust and teleports a fraction of mass to a configurable pre-trusted seed set, bounding the influence of any Sybil cluster. The implementation uses sparse power iteration on the dict-of-dicts local-trust matrix; it is deterministic given the same evidence sequence, preserving NEST's reproducibility guarantee. Registered as built-in 'eigentrust' under the trust layer.
Sixteen tests covering:
- Protocol contract parity with score_average (neutral prior for unknown agents, score in [0, 1], attest/stake passthrough).
- Sybil resistance: a self-vouching clique cannot promote itself above a seed-anchored honest agent; self-vouching cannot inflate; an untrusted reporter cannot trash a seed-anchored target.
- Pre-trusted seed propagation including the absent-seed fallback.
- Determinism: identical evidence sequence yields bit-identical global-trust vectors; the eigenvector sums to one; convergence within the iteration cap; recompute is lazy.
- Adversarial regression: in a miniature of the reputation scenario the cheater ranks strictly below every honest agent.
List both built-in trust plugins in docs/layers/trust.md with a short Sybil-resistance comparison, and call out eigentrust in the main README's 12-layer table.
Reviewer's Guide
Adds a new EigenTrust-based, Sybil-resistant trust plugin to the NEST reference plugins, wires it into the core plugin registry, documents how to use it, and provides a focused test suite validating protocol conformance, determinism, and adversarial properties.
Sequence diagram for EigenTrust score recomputation
sequenceDiagram
actor ObserverAgent
participant EigenTrust
participant Recompute as _recompute
ObserverAgent->>EigenTrust: report(agent, evidence)
EigenTrust->>EigenTrust: set _dirty = True
ObserverAgent->>EigenTrust: score(agent)
alt [no agents present]
EigenTrust-->>ObserverAgent: ReputationScore(score=0.5)
else [agents present]
opt [ _dirty ]
EigenTrust->>Recompute: _recompute()
Recompute-->>EigenTrust: update _global_trust, set _dirty = False
end
EigenTrust-->>ObserverAgent: ReputationScore(score=normalized _global_trust[agent] or 0.5)
end
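The report/score flow in the diagram above is a dirty-flag cache. A minimal sketch of that pattern follows, assuming a hypothetical `LazyTrustCache` class and a placeholder mean aggregate where the plugin would run power iteration; none of these names come from the PR itself.

```python
from dataclasses import dataclass, field


@dataclass
class LazyTrustCache:
    """Hypothetical stand-in for the plugin's dirty-flag caching (not the PR's code)."""

    reports: list[tuple[str, float]] = field(default_factory=list)
    _scores: dict[str, float] = field(default_factory=dict)
    _dirty: bool = False
    recompute_calls: int = 0  # instrumentation for this demo only

    def report(self, agent: str, value: float) -> None:
        # Writes are cheap: record the evidence and invalidate the cache.
        self.reports.append((agent, value))
        self._dirty = True

    def score(self, agent: str) -> float:
        if not self.reports:
            return 0.5  # neutral prior when nothing has been observed
        if self._dirty:
            self._recompute()
        return self._scores.get(agent, 0.5)  # unknown agents stay neutral

    def _recompute(self) -> None:
        # Placeholder aggregate (a plain mean); the real plugin would run
        # power iteration here. This is an assumption for illustration.
        grouped: dict[str, list[float]] = {}
        for agent, value in self.reports:
            grouped.setdefault(agent, []).append(value)
        self._scores = {a: sum(v) / len(v) for a, v in grouped.items()}
        self._dirty = False
        self.recompute_calls += 1


cache = LazyTrustCache()
cache.report("a1", 1.0)
cache.report("a1", 0.5)
first = cache.score("a1")   # triggers the single recompute
second = cache.score("a1")  # served from the cached result
```

Two reports followed by two `score` calls cost one recompute, which is why repeated reads between writes stay cheap.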
Hey - I've found 2 issues
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location path="packages/nest-plugins-reference/nest_plugins_reference/trust/eigentrust.py" line_range="169-170" />
<code_context>
+ # Normalise so the most-trusted agent gets ~1.0 (raw eigenvector
+ # entries sum to 1, which makes them hard to compare to the
+ # reference plugin's [0, 1] scale).
+ max_val = max(self._global_trust.values()) or 1.0
+ score = raw / max_val
+
+ samples = self._sample_count.get(agent, 0)
</code_context>
<issue_to_address>
**suggestion (performance):** Avoid recomputing the normalisation max on every score() call.
This performs an O(n) scan of all trust values on every score() call. Since `_global_trust` only changes when `_dirty` triggers `_recompute()`, cache the max there (e.g. `self._max_global_trust`) and reuse it here to keep score() O(1).
Suggested implementation:
```python
raw = self._global_trust.get(agent)
if raw is None:
    # Agent we've never heard of: same neutral prior as the
    # reference plugin so validators that compare baselines
    # don't see a free win.
    return ReputationScore(agent_id=agent, score=0.5, confidence=0.0, sample_count=0)
# Normalise so the most-trusted agent gets ~1.0 (raw eigenvector
# entries sum to 1, which makes them hard to compare to the
# reference plugin's [0, 1] scale).
max_val = self._max_global_trust or 1.0
score = raw / max_val
samples = self._sample_count.get(agent, 0)
```
```python
self._global_trust: dict[AgentId, float] = {}
self._max_global_trust: float = 1.0
```
```python
self._global_trust = global_trust
self._max_global_trust = max(self._global_trust.values(), default=1.0)
```
If any of the SEARCH blocks do not match exactly (for example, if `_global_trust` is initialized without a type annotation, or `_recompute` assigns to `_global_trust` in a different way), adjust the SEARCH lines to match your local code. The important functional pieces are:
1. In `__init__` (or wherever `_global_trust` is first created), add:
```python
self._max_global_trust: float = 1.0
```
2. In the method that recomputes global trust (likely `_recompute`), immediately after assigning to `self._global_trust`, add:
```python
self._max_global_trust = max(self._global_trust.values(), default=1.0)
```
3. In `score()`, replace the `max(self._global_trust.values())` call with:
```python
max_val = self._max_global_trust or 1.0
```
so that `score()` becomes O(1) with respect to the number of agents.
</issue_to_address>
### Comment 2
<location path="packages/nest-plugins-reference/nest_plugins_reference/trust/eigentrust.py" line_range="236-247" />
<code_context>
+ # ``unknown`` kinds: counted but neutral.
+ self._dirty = True
+
+ async def stake(self, agent: AgentId, amount: int) -> None:
+ """Stake reputation on ``agent``'s good behaviour.
+
+ Kept for interface compatibility; staking does not influence
+ the EigenTrust eigenvector in this implementation, but the
+ amount is tracked so callers / validators can inspect it.
+
+ Example::
+
+ await trust.stake(AgentId("a1"), 100)
+ """
+ self._stakes[agent] = self._stakes.get(agent, 0) + amount
+
+ # ------------------------------------------------------- introspection
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider validating the stake amount to avoid negative or nonsensical values.
Because `stake()` is public, callers can currently pass negative values and effectively reduce an existing stake. If that isn’t intended, consider enforcing a non-negative `amount` (e.g., `amount > 0` or `>= 0`) and raising on invalid input so `_stakes` never contains unexpected negative values.
```suggestion
async def stake(self, agent: AgentId, amount: int) -> None:
    """Stake reputation on ``agent``'s good behaviour.

    Kept for interface compatibility; staking does not influence
    the EigenTrust eigenvector in this implementation, but the
    amount is tracked so callers / validators can inspect it.

    Example::

        await trust.stake(AgentId("a1"), 100)
    """
    if amount <= 0:
        raise ValueError(f"stake amount must be positive, got {amount}")
    self._stakes[agent] = self._stakes.get(agent, 0) + amount
```
</issue_to_address>
Integration of 5 platform tracks built in parallel by specialist agents:
- platform/ci-hygiene (PR #12): Makefile + pre-commit + idempotent CI feedback bot + CONTRIBUTING Definition of Done
- platform/open-problems (PR #13): 10 differentiated open problems across 10 layers, charter, judging doc
- platform/judge-panel (PR #14): rubric, anthropic + openai providers, run_all CLI, real-diff fixture, live gpt-5.5 scoreboard for PRs #2-#11
- platform/research-harness (PR #15): conditions matrix, claude-CLI live runner, collect + analyze, dry-run fixtures + tests
- platform/marketplace-ui (PR #16): /hackathon Next.js section with author tags, judge scores, layer browser; Python data adapter
Schema reconciled end-to-end (rubric -> scores.json -> adapter -> TS types -> UI) on the 6-dim 1-5 scale with totals in [6, 30].
Local CI: 341 passed, 1 skipped (matplotlib gated), 1 deselected (live marker).
Live judge scoreboard top:
- #2 harvard-phd trust 26.0/30 (EigenTrust + checkable invariants)
- #7 coinbase-crypto payments 26.0/30 (HTLC escrow)
- #6 stanford-ml-phd trust 25.0/30
- #11 google-staff transport 25.0/30
Which piece and why
Layer 6 (Trust): adds a second built-in plugin, `eigentrust`, alongside the existing `score_average`.
The default `score_average` plugin is the textbook naive baseline: every report counts equally and the reporter's identity is ignored. That's fine as scaffolding, but it means the `reputation` scenario can't actually surface anything interesting about who is reporting whom: a Sybil clique can promote itself to the top in O(reports), so any protocol researcher comparing against `score_average` is fighting a strawman. As an ML/security researcher this is the layer where the bundled reference plugin most clearly limits what NEST can reveal.
Core idea
EigenTrust (Kamvar, Schlosser, Garcia-Molina; WWW '03) is the canonical "trust as graph centrality" algorithm and a frequent baseline in the multi-agent / P2P-reputation literature.
- Builds a local-trust matrix `C[reporter][subject]` from `Evidence` reports (positive minus negative, then row-normalized).
- Runs power iteration `t ← (1 − α)·Cᵀ t + α·p`. `p` is a configurable pre-trusted seed distribution (defaults to uniform over observed agents); reporters with no usable local trust are treated as dangling and redistributed through `p`, mirroring the standard PageRank reformulation.
- Recompute is lazy: `report` marks dirty, `score` recomputes on demand, so many `score` queries between reports are essentially free.
No new runtime dependencies (no NumPy); pure Python dict-of-dicts so cost scales with edges, not n².
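The update rule above can be sketched in pure Python over a dict-of-dicts matrix. This is an illustrative reading of the description, not the PR's `eigentrust.py`: the function name, the defaults for `alpha`, `max_iter` and `tol`, and the choice to clip negative net trust to zero before row-normalizing are all assumptions.

```python
def eigentrust(local, pre_trusted=None, alpha=0.15, max_iter=200, tol=1e-12):
    """Sketch of t <- (1 - alpha) * C^T t + alpha * p (hypothetical helper).

    local[reporter][subject] holds net (positive minus negative) report counts.
    """
    agents = sorted(set(local) | {s for row in local.values() for s in row})
    if not agents:
        return {}
    # Seeds that never appeared are ignored; with none left, p is uniform.
    seeds = set(a for a in (pre_trusted or ()) if a in agents) or set(agents)
    p = {a: (1.0 / len(seeds) if a in seeds else 0.0) for a in agents}

    # Row-normalise each reporter's outgoing trust (assumption: negatives clip to 0).
    rows = {}
    for reporter in agents:
        positive = {s: v for s, v in sorted(local.get(reporter, {}).items()) if v > 0}
        total = sum(positive.values())
        if total > 0:
            rows[reporter] = {s: v / total for s, v in positive.items()}

    t = dict(p)
    for _ in range(max_iter):
        nxt = {a: 0.0 for a in agents}
        dangling = 0.0
        for reporter in agents:  # sorted order keeps float sums reproducible
            row = rows.get(reporter)
            if row is None:
                dangling += t[reporter]  # no usable local trust
                continue
            for subject, weight in row.items():
                nxt[subject] += t[reporter] * weight
        for a in agents:
            # Dangling mass is redistributed through p, then teleport is applied.
            nxt[a] = (1 - alpha) * (nxt[a] + dangling * p[a]) + alpha * p[a]
        delta = sum(abs(nxt[a] - t[a]) for a in agents)
        t = nxt
        if delta < tol:
            break
    return t


# A three-agent Sybil ring vouching 5x each vs. one seed-endorsed honest agent.
local = {
    "seed": {"honest": 1},
    "s1": {"s2": 5},
    "s2": {"s3": 5},
    "s3": {"s1": 5},
}
trust = eigentrust(local, pre_trusted={"seed"})
```

In this toy graph nothing trusted links into the ring, so the Sybils receive no mass however often they vouch for each other. Iterating agents in sorted order is one simple way to keep the result reproducible for the same evidence.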
How to test
Unit + property tests (16, all passing):
Headline adversarial properties covered:
- `test_sybil_clique_cannot_promote_itself`: 10 Sybils circle-vouching 5× each cannot beat one honest agent with a single seed endorsement.
- `test_self_vouching_does_not_inflate`: agents reporting themselves do not climb above a seed-anchored peer.
- `test_distrusted_reporter_cannot_swing_against_seed`: 200 rogue negative reports + 200 rogue self-promotions still leave a seed-anchored target above the rogue.
- `test_eigentrust_separates_cheater_below_honest`: miniature of the bundled `reputation` scenario; the cheater ends strictly below every honest agent.
- `test_same_evidence_sequence_yields_identical_vector`: determinism check (bit-equal floats across runs).
End-to-end swap against the `reputation` scenario:
Full workspace suite (regression check):
`uv run pytest packages/ -q` → 275 passed (up from 259 with my 16 new tests).
Lint + types both clean:
`uv run ruff check ...` and `uv run pyright ...` pass on the new files.
Key assumptions
- Without a `pre_trusted` set, `p` is uniform over agents that have appeared as a reporter or subject. With explicit seeds that haven't appeared yet, the algorithm falls back to uniform rather than concentrating mass on absent agents.
- Scores are normalized to `score_average`'s [0, 1] scale for fair side-by-side comparisons.
- Unknown agents get the same neutral prior as `score_average`, so baseline numbers don't shift on a plugin swap.
- `Trust.stake` is kept as an interface-compatible passthrough; staking does not influence the eigenvector in this implementation (could be added later via weighting `p`).
- The `reputation` scenario's `ObserverAgent` currently keeps its own counter and does not call `trust.score()`, so the plugin's effect on validator output is via the trust-layer state, not the trace. A natural follow-up is to have the observer publish the global trust vector to the blackboard so validators can assert on rank-ordering.
Persona
Stanford ML PhD interested in adversarial multi-agent RL, benchmark design, and reproducibility — looking for plugins that turn NEST scenarios into proper protocol stress tests rather than smoke tests.
Future work
- Fold `stake()` amounts into the seed distribution `p` so committed-stake agents anchor the system, giving NEST a free testbed for proof-of-stake reputation.
- A `trust_ranking` property check that compares the trust-layer global vector against ground-truth honesty labels in the `reputation` scenario (e.g., Spearman rank correlation between EigenTrust score and the actual cheat probability). That would turn the validator into a real adversarial benchmark.
- `trust.score()` lookups before accepting a seller would let researchers benchmark trust-driven partner selection.
https://claude.ai/code/session_01C5j2D4MgCkPgsjSCqBVpWW
Generated by Claude Code
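The rank-correlation check listed under Future work could be prototyped without extra dependencies. A minimal sketch, assuming hypothetical helpers `ranks` and `spearman` and made-up scores for four agents; nothing here is part of the PR or of NEST's validator API.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(xs, ys):
    """Spearman rho, computed as the Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical data: trust scores and ground-truth cheat probabilities.
trust_scores = [0.9, 0.8, 0.7, 0.1]
cheat_probability = [0.0, 0.05, 0.1, 0.9]
rho = spearman(trust_scores, cheat_probability)
```

A strongly negative `rho` would mean agents that cheat more often are ranked as less trusted, which is the ordering such a property check would look for.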
Summary by Sourcery
Add a new EigenTrust-based trust plugin alongside the existing score_average implementation and document how to use it for Sybil-resistant reputation scoring.