feat(discovery): configurable RPC endpoint for agent discovery #240

@uibeka

Problem

Agent discovery in getRegisteredAgentsByEvents() (and other functions in erc8004.ts) uses viem's default RPC endpoint for Base, which resolves to the public RPC at mainnet.base.org. This public RPC has reliability issues that block all agent discovery with no fallback:

Observed Feb 26: mainnet.base.org returned HTTP 503 ("no backend is currently healthy to serve traffic") for an extended period. eth_blockNumber queries still succeeded, but eth_getLogs requests failed. Every agent running on the Automaton framework lost the ability to discover other agents, and operators had no way to configure an alternative RPC endpoint.

Impact

  • Complete discovery outage when Base public RPC has degraded service
  • No operator workaround: the RPC URL is determined by viem's chain defaults inside erc8004.ts
  • Affects all operators equally — everyone depends on the same hardcoded default
  • $10+ in wasted agent credits observed from a single outage (agents loop on failed discovery)

Proposed Solution

Allow operators to configure a custom RPC URL for agent discovery. This would let operators use their own RPC provider (Alchemy, QuickNode, BlastAPI, etc.) instead of relying on the public endpoint.

Design considerations for the maintainers:

| Approach | Pros | Cons |
| --- | --- | --- |
| Environment variable (AUTOMATON_RPC_URL) | Simple, zero schema changes, operators set and forget | May not match Conway's config patterns |
| automaton.json config field | Consistent with existing config pattern (if applicable) | Requires schema changes |
| Function parameter (rpcUrl?: string) | Explicit, no global state, caller controls | Changes exported function signatures |
| Fallback RPC list with automatic retry | Full resilience, automatic failover | Complex retry logic, over-engineered for initial implementation |
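
To make the first and third options concrete, here is a minimal sketch of how the two could be combined: an explicit argument wins, AUTOMATON_RPC_URL is the fallback, and `undefined` means "keep the current default". The names `resolveRpcUrl`, `clean`, and `Env` are hypothetical and do not exist in erc8004.ts. The environment is passed in as a plain object (for example `process.env`), so the helper has no global state.

```typescript
// Sketch only: resolveRpcUrl, clean, and Env are hypothetical names,
// not part of erc8004.ts or viem.
type Env = Record<string, string | undefined>;

// Treat undefined, empty, and whitespace-only values as "not set".
function clean(value: string | undefined): string | undefined {
  const trimmed = value?.trim();
  return trimmed ? trimmed : undefined;
}

// Precedence: explicit argument, then AUTOMATON_RPC_URL, then undefined
// (undefined means the caller keeps whatever default it uses today).
function resolveRpcUrl(explicit: string | undefined, env: Env): string | undefined {
  const candidate = clean(explicit) ?? clean(env["AUTOMATON_RPC_URL"]);
  if (candidate === undefined) return undefined;
  // Reject anything that is not an HTTP(S) URL early, with a clear message.
  if (!/^https?:\/\//i.test(candidate)) {
    throw new Error(`RPC URL must start with http:// or https://, got: ${candidate}`);
  }
  return candidate;
}
```

The result could be passed directly to viem's `http()` transport, which uses the chain's default RPC URL when called with `undefined`, so existing behaviour would be unchanged for operators who configure nothing.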

We intentionally kept this as an issue rather than a PR because the design decision depends on Conway's configuration patterns and architectural preferences. Happy to implement whichever approach the team prefers.

Our Production Workaround (Reference)

We locally patched all three createPublicClient calls in erc8004.ts to use BlastAPI's free tier:

transport: http("https://base-mainnet.public.blastapi.io")

This immediately resolved the outage. BlastAPI has been reliable, but hardcoding a specific provider isn't the right upstream solution.
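
As a rough illustration of how the provider choice could move out of the source and into configuration, the sketch below builds an ordered endpoint list: operator-supplied URLs first, the public endpoint last. `buildEndpointList` and `PUBLIC_BASE_RPC` are hypothetical names, and how the configured URLs would reach this function (environment variable, automaton.json, or a parameter) is left open.

```typescript
// Sketch only: buildEndpointList and PUBLIC_BASE_RPC are hypothetical,
// not existing identifiers in erc8004.ts.
const PUBLIC_BASE_RPC = "https://mainnet.base.org";

// Returns operator-configured URLs in the order given, followed by the
// public endpoint as a last resort. Blank entries and duplicates are dropped.
function buildEndpointList(
  configured: string[],
  publicDefault: string = PUBLIC_BASE_RPC,
): string[] {
  const seen = new Set<string>();
  const ordered: string[] = [];
  for (const raw of [...configured, publicDefault]) {
    const url = raw.trim();
    if (url !== "" && !seen.has(url)) {
      seen.add(url);
      ordered.push(url);
    }
  }
  return ordered;
}
```

A list like this could either be tried in order by hand or mapped onto viem's `fallback` transport, which accepts an array of transports. Which of the two fits better depends on the design option the maintainers choose.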
