[Hackathon] stanford-ml-phd: EigenTrust plugin for the trust layer#6

Open
mariagorskikh wants to merge 3 commits into main from hackathon/stanford-ml-phd-eigentrust

Conversation

@mariagorskikh

@mariagorskikh mariagorskikh commented May 26, 2026

Collaborator

Which piece and why

Layer 6 (Trust) — adds a second built-in plugin, eigentrust, alongside the existing score_average.

The default score_average plugin is the textbook naive baseline: every report counts equally and the reporter's identity is ignored. That's fine as scaffolding but it means the reputation scenario can't actually surface anything interesting about who is reporting whom — a Sybil clique can promote itself to the top in O(reports), so any protocol researcher comparing against score_average is fighting a strawman. As an ML/security researcher this is the layer where the bundled reference plugin most clearly limits what NEST can reveal.

Core idea

EigenTrust (Kamvar, Schlosser, Garcia-Molina; WWW '03) — the canonical "trust as graph centrality" algorithm and a frequent baseline in the multi-agent / P2P-reputation literature.

  • Maintain a sparse local-trust matrix C[reporter][subject] from Evidence reports (positive minus negative, then row-normalized).
  • Compute the global trust vector as the principal left eigenvector of the teleport-smoothed matrix via sparse power iteration: t ← (1 − α)·Cᵀ t + α·p.
  • p is a configurable pre-trusted seed distribution (defaults to uniform over observed agents); reporters with no usable local trust are treated as dangling and redistributed through p, mirroring the standard PageRank reformulation.
  • Cached lazily: report marks dirty, score recomputes on demand — so many score queries between reports are essentially free.
  • Deterministic under sorted agent ordering with a fixed iteration cap (64) and tolerance (1e-8). Same evidence sequence → bit-identical global-trust vector. NEST's "same seed → same trace" guarantee is preserved.

No new runtime dependencies (no NumPy); pure Python dict-of-dicts so cost scales with edges, not n².
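
The bullets above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the plugin's code: the `(reporter, subject, delta)` tuple encoding, the function names `local_trust` and `eigentrust`, the clamping of net-negative entries to zero, and the dropping of self-reports are all choices made for this example.

```python
from collections import defaultdict


def local_trust(reports):
    """Build a sparse, row-normalised matrix C[reporter][subject].

    `reports` holds (reporter, subject, delta) tuples with delta = +1 for a
    positive report and -1 for a negative one (hypothetical encoding).
    """
    raw = defaultdict(lambda: defaultdict(float))
    for reporter, subject, delta in reports:
        if reporter != subject:  # assumption: self-reports carry no weight
            raw[reporter][subject] += delta
    c = {}
    for reporter, row in raw.items():
        # Assumption: net-negative local trust is clamped to zero.
        pos = {s: v for s, v in row.items() if v > 0}
        total = sum(pos[s] for s in sorted(pos))
        if total > 0:
            c[reporter] = {s: pos[s] / total for s in sorted(pos)}
    return c


def eigentrust(reports, pre_trusted=None, alpha=0.15, max_iter=64, tol=1e-8):
    """Power iteration for t <- (1 - alpha) * C^T t + alpha * p."""
    reports = list(reports)
    agents = sorted({a for r, s, _ in reports for a in (r, s)})
    if not agents:
        return {}
    # Seed distribution p: uniform over present seeds, else over all agents.
    seeds = set(pre_trusted or ()) & set(agents)
    if not seeds:
        seeds = set(agents)
    p = {a: (1.0 / len(seeds) if a in seeds else 0.0) for a in agents}
    c = local_trust(reports)
    t = dict(p)
    for _ in range(max_iter):
        nxt = {a: 0.0 for a in agents}
        dangling = 0.0
        for reporter in agents:  # sorted order keeps float sums reproducible
            row = c.get(reporter)
            if row is None:
                dangling += t[reporter]  # no usable local trust
                continue
            for subject, weight in row.items():
                nxt[subject] += t[reporter] * weight
        for a in agents:
            # Dangling mass is redistributed through p before teleporting.
            nxt[a] = (1 - alpha) * (nxt[a] + dangling * p[a]) + alpha * p[a]
        delta = sum(abs(nxt[a] - t[a]) for a in agents)
        t = nxt
        if delta < tol:
            break
    return t
```

Because only observed edits are stored and visited, each iteration touches one entry per edge plus one per agent, which is what keeps the cost proportional to edges rather than n². In this sketch a Sybil ring that no seed vouches for receives no trust mass at all, since mass only enters the graph through `p`.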

How to test

Unit + property tests (16, all passing):

uv run pytest packages/nest-plugins-reference/tests/test_eigentrust.py -v

Headline adversarial properties covered:

  • test_sybil_clique_cannot_promote_itself — 10 Sybils circle-vouching 5× each cannot beat one honest agent with a single seed endorsement.
  • test_self_vouching_does_not_inflate — agents reporting themselves do not climb above a seed-anchored peer.
  • test_distrusted_reporter_cannot_swing_against_seed — 200 rogue negative reports + 200 rogue self-promotions still leave a seed-anchored target above the rogue.
  • test_eigentrust_separates_cheater_below_honest — miniature of the bundled reputation scenario: cheater ends strictly below every honest agent.
  • test_same_evidence_sequence_yields_identical_vector — determinism check (bit-equal floats across runs).
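
Why the determinism check depends on a fixed agent ordering can be shown with a small, self-contained sketch. Floating-point addition is not associative, so accumulating the same values in a different order can change the last bit of the result. The helper names below are hypothetical and are not taken from the test file.

```python
def accumulate(values):
    """Add floats left to right, the way a hand-written loop would."""
    total = 0.0
    for v in values:
        total += v
    return total


def total_in_insertion_order(trust):
    # Dict iteration follows insertion order, i.e. report arrival order.
    return accumulate(trust.values())


def total_in_sorted_order(trust):
    # Sorting the keys fixes the sequence of additions.
    return accumulate(trust[k] for k in sorted(trust))


# Same contents, different insertion order.
first = {"a": 0.1, "b": 0.2, "c": 0.3}
second = {"c": 0.3, "b": 0.2, "a": 0.1}

order_sensitive = total_in_insertion_order(first) != total_in_insertion_order(second)
order_stable = total_in_sorted_order(first) == total_in_sorted_order(second)
```

On IEEE-754 doubles the two insertion orders give 0.6000000000000001 and 0.6 respectively, so `order_sensitive` is true, while the sorted traversal yields the same float for both dicts.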

End-to-end swap against the reputation scenario:

# scenarios/reputation.yaml
layers:
  trust: eigentrust   # was: score_average

uv run nest run scenarios/reputation.yaml
uv run python -c "from pathlib import Path; from nest_core.validators import validate_trace; \
  [print(('PASS' if r.passed else 'FAIL'), r.name) for r in validate_trace(Path('traces/reputation.jsonl'), 'reputation')]"

Full workspace suite (regression check): uv run pytest packages/ -q → 275 passed (up from 259 with my 16 new tests).

Lint + types both clean: uv run ruff check ... and uv run pyright ... pass on the new files.

Key assumptions

  • α (teleport probability) defaults to 0.15, the PageRank canonical value; the paper recommends 0.1–0.2. Exposed as a constructor kwarg.
  • Without an explicit pre_trusted set, p is uniform over agents that have appeared as a reporter or subject. With explicit seeds that haven't appeared yet, the algorithm falls back to uniform rather than concentrating mass on absent agents.
  • Score is normalized by dividing by the max raw eigenvector entry, so the most-trusted agent gets ~1.0. This matches score_average's [0, 1] scale for fair side-by-side comparisons.
  • Unknown agents return the same neutral prior (0.5) as score_average, so baseline numbers don't shift on a plugin swap.
  • Trust.stake is kept as interface-compatible passthrough; staking does not influence the eigenvector in this implementation (could be added later via weighting p).
  • The reputation scenario's ObserverAgent currently keeps its own counter and does not call trust.score(), so the plugin's effect on validator output is via the trust-layer state, not the trace. A natural follow-up is to have the observer publish the global trust vector to the blackboard so validators can assert on rank-ordering.
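
The seed fallback and the score normalisation described above can be sketched as two small functions. This is an illustration only: the names `seed_distribution`, `normalized_score`, and `NEUTRAL_PRIOR` are hypothetical, and the handling of a partly absent seed set (keep only the seeds that have appeared) is an assumption the description does not spell out.

```python
NEUTRAL_PRIOR = 0.5  # score returned for agents the plugin has never seen


def seed_distribution(observed, pre_trusted=None):
    """Return p as a dict over observed agents that sums to 1."""
    observed = sorted(observed)
    if not observed:
        return {}
    present = {a for a in observed if pre_trusted and a in pre_trusted}
    # No configured seed has appeared yet: fall back to uniform rather than
    # concentrating mass on absent agents.
    support = present or set(observed)
    weight = 1.0 / len(support)
    return {a: (weight if a in support else 0.0) for a in observed}


def normalized_score(global_trust, agent):
    """Map a raw eigenvector entry onto the [0, 1] scale."""
    raw = global_trust.get(agent)
    if raw is None:
        return NEUTRAL_PRIOR
    max_val = max(global_trust.values()) or 1.0  # guard against all-zero
    return raw / max_val
```

Dividing by the maximum means scores are relative to the current top agent, so an agent's score can change when someone else's raw trust changes even if its own evidence does not.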

Persona

Stanford ML PhD interested in adversarial multi-agent RL, benchmark design, and reproducibility — looking for plugins that turn NEST scenarios into proper protocol stress tests rather than smoke tests.

Future work

  • Time-weighted EigenTrust — apply an exponential decay to old reports before computing C (Kamvar 2003 §6, also explored in BetaReputation [Jøsang & Ismail]).
  • Stake-weighted seeds — fold stake() amounts into the seed distribution p so committed-stake agents anchor the system, giving NEST a free testbed for proof-of-stake reputation.
  • Validator extension — add a trust_ranking property check that compares the trust-layer global vector against ground-truth honesty labels in the reputation scenario (e.g., Spearman rank correlation between EigenTrust score and the actual cheat probability). That would turn the validator into a real adversarial benchmark.
  • EigenTrust on the marketplace scenario — the bundled marketplace doesn't currently exercise the trust layer; wiring buyer-side trust.score() lookups before accepting a seller would let researchers benchmark trust-driven partner selection.
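
As a rough idea of what the proposed trust_ranking check could compute, here is a dependency-free Spearman rank correlation. It is a sketch under assumptions: the function names and the two input dicts (trust score and ground-truth cheat probability per agent) are hypothetical, and no such validator exists in this PR.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (vx * vy) ** 0.5


def spearman(trust, cheat_prob):
    """Spearman correlation over agents present in both mappings."""
    agents = sorted(set(trust) & set(cheat_prob))
    if len(agents) < 2:
        raise ValueError("need at least two common agents")
    return pearson(
        average_ranks([trust[a] for a in agents]),
        average_ranks([cheat_prob[a] for a in agents]),
    )
```

A value near -1 would mean that higher trust lines up with lower cheat probability, which is the ordering such a check would want to see.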

https://claude.ai/code/session_01C5j2D4MgCkPgsjSCqBVpWW


Generated by Claude Code

Summary by Sourcery

Add a new EigenTrust-based trust plugin alongside the existing score_average implementation and document how to use it for Sybil-resistant reputation scoring.

New Features:

  • Introduce an EigenTrust trust plugin that computes transitive, Sybil-resistant global reputation scores from evidence reports.
  • Expose the EigenTrust plugin as a built-in trust-layer option that can be selected by name in scenarios.

Enhancements:

  • Extend trust-layer documentation to describe all bundled trust plugins and how to switch between them in scenarios.
  • Update the main README to mention EigenTrust as an additional built-in trust plugin at the Trust layer.

Tests:

  • Add a comprehensive EigenTrust test suite covering protocol contract behaviour, Sybil-resistance properties, seed handling, determinism, and convergence characteristics.

claude added 3 commits May 26, 2026 18:55
The default score_average trust plugin treats every report as equally
credible. A malicious clique can therefore inflate (or trash) any
agent's score by spamming reports.

EigenTrust (Kamvar, Schlosser, Garcia-Molina; WWW '03) weighs each
report by the reporter's current trust and teleports a fraction of
mass to a configurable pre-trusted seed set, bounding the influence
of any Sybil cluster. The implementation uses sparse power iteration
on the dict-of-dicts local-trust matrix; it is deterministic given
the same evidence sequence, preserving NEST's reproducibility
guarantee.

Registered as built-in 'eigentrust' under the trust layer.

Sixteen tests covering:

- Protocol contract parity with score_average (neutral prior for
  unknown agents, score in [0, 1], attest/stake passthrough).
- Sybil resistance: a self-vouching clique cannot promote itself
  above a seed-anchored honest agent; self-vouching cannot inflate;
  an untrusted reporter cannot trash a seed-anchored target.
- Pre-trusted seed propagation including the absent-seed fallback.
- Determinism: identical evidence sequence yields bit-identical
  global-trust vectors; the eigenvector sums to one; convergence
  within the iteration cap; recompute is lazy.
- Adversarial regression: in a miniature of the reputation scenario
  the cheater ranks strictly below every honest agent.

List both built-in trust plugins in docs/layers/trust.md with a
short Sybil-resistance comparison, and call out eigentrust in the
main README's 12-layer table.
@sourcery-ai

sourcery-ai Bot commented May 26, 2026


Reviewer's Guide

Adds a new EigenTrust-based, Sybil-resistant trust plugin to the NEST reference plugins, wires it into the core plugin registry, documents how to use it, and provides a focused test suite validating protocol conformance, determinism, and adversarial properties.

Sequence diagram for EigenTrust score recomputation

sequenceDiagram
    actor ObserverAgent
    participant EigenTrust
    participant Recompute as _recompute

    ObserverAgent->>EigenTrust: report(agent, evidence)
    EigenTrust->>EigenTrust: set _dirty = True

    ObserverAgent->>EigenTrust: score(agent)
    alt [no agents present]
        EigenTrust-->>ObserverAgent: ReputationScore(score=0.5)
    else [agents present]
        opt [ _dirty ]
            EigenTrust->>Recompute: _recompute()
            Recompute-->>EigenTrust: update _global_trust, set _dirty = False
        end
        EigenTrust-->>ObserverAgent: ReputationScore(score=normalized _global_trust[agent] or 0.5)
    end
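
The dirty-flag flow in the diagram can be sketched as a small synchronous class. This is a simplified illustration, not the plugin itself: the real methods are async, and the `LazyTrustCache` name, the `compute` callback, and the `recompute_count` counter are hypothetical.

```python
class LazyTrustCache:
    """Recompute global trust only when a report arrived since the last read."""

    def __init__(self, compute):
        self._compute = compute  # callable: list of reports -> dict of scores
        self._reports = []
        self._cached = {}
        self._dirty = False
        self.recompute_count = 0  # exposed so tests can check laziness

    def report(self, reporter, subject, delta):
        # Writes are cheap: record the evidence and mark the cache stale.
        self._reports.append((reporter, subject, delta))
        self._dirty = True

    def score(self, agent, default=0.5):
        # Reads pay for a recompute at most once per batch of reports.
        if self._dirty:
            self._cached = self._compute(self._reports)
            self._dirty = False
            self.recompute_count += 1
        return self._cached.get(agent, default)
```

With this shape, any number of `score` calls between two `report` calls trigger a single recompute, and an agent missing from the cached vector gets the neutral default.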

File-Level Changes

Change: Introduce EigenTrust trust plugin implementing EigenTrust/PageRank-style global trust over sparse report graphs.
Details:
  • Implement EigenTrust class maintaining sparse positive/negative local trust matrices keyed by reporter/subject and a set of known agents.
  • Compute global trust lazily via power iteration over a teleport-smoothed matrix with configurable alpha, iteration cap, and tolerance, caching results until new reports arrive.
  • Derive a seed distribution from an optional pre_trusted set or all observed agents, handling dangling reporters by redistributing their mass via the seed vector.
  • Expose scores via Trust-like async methods score, report, attest, and stake, including neutral default scores for unknown agents and normalisation to [0,1] based on max eigenvector entry.
  • Provide introspection helpers for tests/metrics (global_trust vector and iterations_last_run) while preserving deterministic behaviour via sorted agent ordering.
Files:
  • packages/nest-plugins-reference/nest_plugins_reference/trust/eigentrust.py

Change: Add comprehensive tests validating EigenTrust behaviour, Sybil resistance, and determinism.
Details:
  • Add protocol contract tests to ensure neutral default scores, score bounds, attest/stake behaviour, and parameter validation for alpha.
  • Add adversarial tests for Sybil cliques, self-vouching, distrusted attackers versus seed-anchored targets, and transitive propagation from pre-trusted seeds.
  • Test seed handling when configured seeds are absent, ensure global trust sums to one, convergence occurs within a reasonable iteration cap, and recomputations are lazy across multiple score calls.
  • Verify determinism by asserting identical global trust vectors for identical evidence sequences and that EigenTrust ranks an adversarial agent below all honest agents in a miniature reputation scenario.
Files:
  • packages/nest-plugins-reference/tests/test_eigentrust.py

Change: Wire EigenTrust into the plugin registry and update documentation to present it as a built-in trust plugin option.
Details:
  • Register the new ('trust', 'eigentrust') entry in the core plugin mapping to point at the EigenTrust implementation.
  • Update trust-layer documentation to describe both score_average and eigentrust, including a comparison table, source links, and guidance for swapping the trust plugin in the reputation scenario YAML.
  • Adjust the main README layer summary to mention eigentrust as an additional bundled trust plugin and clarify its transitive, Sybil-resistant nature.
  • Refine suggestions for custom trust plugins to emphasise stake-based, time-weighted, attestation-graph, and learned reputation models rather than EigenTrust itself (now provided).
Files:
  • packages/nest-core/nest_core/plugins.py
  • docs/layers/trust.md
  • README.md


@sourcery-ai sourcery-ai Bot left a comment


Hey - I've found 2 issues

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="packages/nest-plugins-reference/nest_plugins_reference/trust/eigentrust.py" line_range="169-170" />
<code_context>
+        # Normalise so the most-trusted agent gets ~1.0 (raw eigenvector
+        # entries sum to 1, which makes them hard to compare to the
+        # reference plugin's [0, 1] scale).
+        max_val = max(self._global_trust.values()) or 1.0
+        score = raw / max_val
+
+        samples = self._sample_count.get(agent, 0)
</code_context>
<issue_to_address>
**suggestion (performance):** Avoid recomputing the normalisation max on every score() call.

This performs an O(n) scan of all trust values on every score() call. Since `_global_trust` only changes when `_dirty` triggers `_recompute()`, cache the max there (e.g. `self._max_global_trust`) and reuse it here to keep score() O(1).

Suggested implementation:

```python
        raw = self._global_trust.get(agent)
        if raw is None:
            # Agent we've never heard of: same neutral prior as the
            # reference plugin so validators that compare baselines
            # don't see a free win.
            return ReputationScore(agent_id=agent, score=0.5, confidence=0.0, sample_count=0)

        # Normalise so the most-trusted agent gets ~1.0 (raw eigenvector
        # entries sum to 1, which makes them hard to compare to the
        # reference plugin's [0, 1] scale).
        max_val = self._max_global_trust or 1.0
        score = raw / max_val

        samples = self._sample_count.get(agent, 0)

```

```python
        self._global_trust: dict[AgentId, float] = {}
        self._max_global_trust: float = 1.0

```

```python
        self._global_trust = global_trust
        self._max_global_trust = max(self._global_trust.values(), default=1.0)

```

If any of the SEARCH blocks do not match exactly (for example, if `_global_trust` is initialized without a type annotation, or `_recompute` assigns to `_global_trust` in a different way), adjust the SEARCH lines to match your local code. The important functional pieces are:

1. In `__init__` (or wherever `_global_trust` is first created), add:

```python
self._max_global_trust: float = 1.0
```

2. In the method that recomputes global trust (likely `_recompute`), immediately after assigning to `self._global_trust`, add:

```python
self._max_global_trust = max(self._global_trust.values(), default=1.0)
```

3. In `score()`, replace the `max(self._global_trust.values())` call with:

```python
max_val = self._max_global_trust or 1.0
```

so that `score()` becomes O(1) with respect to the number of agents.
</issue_to_address>

### Comment 2
<location path="packages/nest-plugins-reference/nest_plugins_reference/trust/eigentrust.py" line_range="236-247" />
<code_context>
+        # ``unknown`` kinds: counted but neutral.
+        self._dirty = True
+
+    async def stake(self, agent: AgentId, amount: int) -> None:
+        """Stake reputation on ``agent``'s good behaviour.
+
+        Kept for interface compatibility; staking does not influence
+        the EigenTrust eigenvector in this implementation, but the
+        amount is tracked so callers / validators can inspect it.
+
+        Example::
+
+            await trust.stake(AgentId("a1"), 100)
+        """
+        self._stakes[agent] = self._stakes.get(agent, 0) + amount
+
+    # ------------------------------------------------------- introspection
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider validating the stake amount to avoid negative or nonsensical values.

Because `stake()` is public, callers can currently pass negative values and effectively reduce an existing stake. If that isn’t intended, consider enforcing a non-negative `amount` (e.g., `amount > 0` or `>= 0`) and raising on invalid input so `_stakes` never contains unexpected negative values.

```suggestion
    async def stake(self, agent: AgentId, amount: int) -> None:
        """Stake reputation on ``agent``'s good behaviour.

        Kept for interface compatibility; staking does not influence
        the EigenTrust eigenvector in this implementation, but the
        amount is tracked so callers / validators can inspect it.

        Example::

            await trust.stake(AgentId("a1"), 100)
        """
        if amount <= 0:
            raise ValueError(f"stake amount must be positive, got {amount}")

        self._stakes[agent] = self._stakes.get(agent, 0) + amount
```
</issue_to_address>



mariagorskikh added a commit that referenced this pull request May 26, 2026
Integration of 5 platform tracks built in parallel by specialist agents:

- platform/ci-hygiene (PR #12): Makefile + pre-commit + idempotent CI feedback bot + CONTRIBUTING Definition of Done
- platform/open-problems (PR #13): 10 differentiated open problems across 10 layers, charter, judging doc
- platform/judge-panel (PR #14): rubric, anthropic + openai providers, run_all CLI, real-diff fixture, live gpt-5.5 scoreboard for PRs #2-#11
- platform/research-harness (PR #15): conditions matrix, claude-CLI live runner, collect + analyze, dry-run fixtures + tests
- platform/marketplace-ui (PR #16): /hackathon Next.js section with author tags, judge scores, layer browser; Python data adapter

Schema reconciled end-to-end (rubric -> scores.json -> adapter -> TS types -> UI) on the 6-dim 1-5 scale with totals in [6, 30].

Local CI: 341 passed, 1 skipped (matplotlib gated), 1 deselected (live marker).

Live judge scoreboard top:
  #2  harvard-phd     trust       26.0/30  (EigenTrust + checkable invariants)
  #7  coinbase-crypto payments    26.0/30  (HTLC escrow)
  #6  stanford-ml-phd trust       25.0/30
  #11 google-staff    transport   25.0/30