Sub-issue of #744. Before reading further, read #744 in full, including all of its comments, where the benchmarking tool's design is worked out in detail. This issue is one slice of that design and assumes the decisions recorded in those comments.
This is step 3 of 5. It depends on #783 (and on #782). This is where the #744 acceptance criteria are first met end-to-end.
Scope
inbox and webfinger scenarios
inbox (completing it from #783, “Benchmarking: fedify bench engine, scenario format, and JSON schema hosting”, if partial): takes a recipient (a handle like acct:alice@host or an actor URI), not a path, since Fedify has no default paths. The inbox URL is discovered the way a real peer discovers it: WebFinger gives the actor URI, and the actor document gives inbox and endpoints.sharedInbox. The inbox mode is shared (the realistic default), personal, or an explicit URL. An embedObject flag distinguishes the pure inbox path from inbox-plus-dereference. Discovery is one-time setup and is excluded from the timed window.
webfinger: handle resolution over configurable handle sets, the discovery primitive the other scenarios reuse.
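The discovery chain described above (handle, then WebFinger, then actor document, then inbox URL) can be sketched as three pure functions. This is a minimal illustration, not the tool's implementation: the names webfingerUrl, actorUriFromJrd, selectInbox, and the InboxMode type are hypothetical, the https scheme is assumed, and throwing when sharedInbox is absent is an assumption (the issue does not say whether shared mode falls back to the personal inbox).

```typescript
// Hypothetical shapes for illustration only; not Fedify's API.
interface JrdLink { rel: string; type?: string; href?: string }
interface Jrd { subject?: string; links?: JrdLink[] }
interface ActorDoc {
  id: string;
  inbox?: string;
  endpoints?: { sharedInbox?: string };
}
type InboxMode = "shared" | "personal" | { url: string };

// Build the WebFinger query URL for a handle such as "acct:alice@host".
// Assumes https; a local fixture served over http would need a different scheme.
function webfingerUrl(handle: string): URL {
  const acct = handle.startsWith("acct:") ? handle : `acct:${handle}`;
  const host = acct.slice(acct.lastIndexOf("@") + 1);
  const url = new URL(`https://${host}/.well-known/webfinger`);
  url.searchParams.set("resource", acct);
  return url;
}

// Pick the actor URI from the JRD: the "self" link with an ActivityPub media type.
function actorUriFromJrd(jrd: Jrd): string | undefined {
  const link = (jrd.links ?? []).find((l) =>
    l.rel === "self" &&
    (l.type === "application/activity+json" ||
      (l.type ?? "").startsWith("application/ld+json"))
  );
  return link?.href;
}

// Choose the inbox URL from the actor document according to the inbox mode.
function selectInbox(actor: ActorDoc, mode: InboxMode): string {
  if (typeof mode === "object") return mode.url; // explicit URL wins
  if (mode === "shared") {
    const shared = actor.endpoints?.sharedInbox;
    // Assumption: no silent fallback to the personal inbox.
    if (shared == null) throw new Error(`${actor.id} has no sharedInbox`);
    return shared;
  }
  if (actor.inbox == null) throw new Error(`${actor.id} has no inbox`);
  return actor.inbox;
}
```

In a real run the two fetches (WebFinger and actor document) would happen once during setup, before the timed window starts, and only the resolved inbox URL would be used while measuring.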
Client-side safety guard
The target is classified from its resolved address as loopback (127.0.0.0/8, ::1, localhost), private (RFC 1918, link-local, .local), or public.
At startup the tool probes GET /.well-known/fedify/bench/stats to detect whether the target advertises benchmarkMode, which is the operator's “not production” assertion.
Two tiers: Safe (target is loopback/private, or advertises benchmarkMode) runs with no friction; Caution (a public target without benchmarkMode) is refused unless --allow-unsafe-target is given.
--allow-unsafe-target is honored only together with an explicit --target. In CI or any non-TTY context the tool never prompts; the flag is mandatory there. A TTY may offer an interactive confirmation instead.
Scenario effect classes (read/write/deliver/fault) drive the warning text. --dry-run resolves discovery and reports the planned load without sending. On a public target, rate and duration must be set explicitly (no aggressive defaults).
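The guard rules above reduce to a small decision function. The following is a rough sketch under stated assumptions: classifyAddress, decideGuard, and the GuardInput fields are hypothetical names, the classifier works on a literal host or address string (the real tool classifies the resolved address, so DNS resolution would come first), and IPv6 private ranges other than link-local are not handled.

```typescript
type AddressClass = "loopback" | "private" | "public";

// Classify a literal host or IP string. Illustrative only.
function classifyAddress(host: string): AddressClass {
  const h = host.toLowerCase().replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  if (h === "localhost" || h === "::1") return "loopback";
  if (h.endsWith(".local")) return "private";
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(h);
  if (m) {
    const a = Number(m[1]);
    const b = Number(m[2]);
    if (a === 127) return "loopback";                        // 127.0.0.0/8
    if (a === 10) return "private";                          // RFC 1918
    if (a === 172 && b >= 16 && b <= 31) return "private";   // RFC 1918
    if (a === 192 && b === 168) return "private";            // RFC 1918
    if (a === 169 && b === 254) return "private";            // IPv4 link-local
    return "public";
  }
  if (/^fe[89ab][0-9a-f]:/.test(h)) return "private";        // fe80::/10 link-local
  return "public";
}

interface GuardInput {
  addressClass: AddressClass;
  benchmarkMode: boolean;     // advertised by the stats probe
  allowUnsafeTarget: boolean; // --allow-unsafe-target was given
  explicitTarget: boolean;    // --target was given explicitly
  isTty: boolean;             // false in CI or any non-TTY context
}
type GuardDecision = "run" | "confirm" | "refuse";

function decideGuard(g: GuardInput): GuardDecision {
  // Safe: loopback/private address, or the target advertises benchmarkMode.
  if (g.addressClass !== "public" || g.benchmarkMode) return "run";
  // Caution: the flag is honored only together with an explicit --target.
  if (g.allowUnsafeTarget && g.explicitTarget) return "run";
  // Never prompt without a TTY; a TTY may offer a confirmation instead.
  return g.isTty ? "confirm" : "refuse";
}
```

Keeping the decision separate from the address classifier and the stats probe means each rule in the list can be covered by a plain unit test without a network.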
expect gating
A scenario's expect thresholds are evaluated and the process exits non-zero on failure. Each entry carries a severity (warn or fail, default fail). The metric vocabulary and the per-type definition of success (for example which status codes count as success for inbox) are pinned alongside the schema.
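The gating rule can be illustrated with a short evaluator. This is only a sketch: the ExpectEntry shape (metric, op, value, severity), the metric names in the usage note, and the exit code value 1 are hypothetical stand-ins for whatever the pinned schema defines, and treating a missing metric as a violation is an assumption.

```typescript
type Severity = "warn" | "fail";

// Hypothetical entry shape; the real one is defined by the scenario schema.
interface ExpectEntry {
  metric: string;
  op: "<=" | ">=";
  value: number;
  severity?: Severity; // defaults to "fail"
}

interface Violation {
  entry: ExpectEntry;
  actual: number | undefined;
  severity: Severity;
}

function evaluateExpect(
  entries: ExpectEntry[],
  metrics: Record<string, number>,
): { violations: Violation[]; exitCode: number } {
  const violations: Violation[] = [];
  for (const entry of entries) {
    const actual = metrics[entry.metric];
    // Assumption: a metric that was not measured counts as a violation.
    const ok = actual !== undefined &&
      (entry.op === "<=" ? actual <= entry.value : actual >= entry.value);
    if (!ok) {
      violations.push({ entry, actual, severity: entry.severity ?? "fail" });
    }
  }
  // Only fail-severity violations make the process exit non-zero.
  const exitCode = violations.some((v) => v.severity === "fail") ? 1 : 0;
  return { violations, exitCode };
}
```

For example, with measured metrics `{ successRate: 0.97, p95LatencyMs: 120 }`, an entry requiring `successRate >= 0.99` at the default severity yields exit code 1, while the same entry with severity warn is reported but leaves the exit code at 0.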
Fixture app
An app under test/bench/ (in-memory KV, in-process queue, benchmarkMode, with the recipients the inbox scenario targets) that doubles as the local test server for the scenario tests.
Dependencies
Depends on #783 (the engine and scenario format) and #782 (the benchmarkMode target and the stats probe).
Acceptance criteria
A documented command runs against a local Fedify app and yields latency, throughput, success rate, and error summaries.
inbox (shared, signed) and webfinger work end-to-end, with inbox discovery via WebFinger.
JSON output suitable for CI comparison; expect exits non-zero on a fail-severity violation.
The guard refuses a public non-benchmarkMode target without --allow-unsafe-target, which is mandatory (not a prompt) in CI.
--dry-run resolves discovery and reports planned load without sending.
The test/bench/ fixture is used by the scenario tests.
Documentation
Add usage, safety guidance, and CI examples to docs/manual/benchmarking.md, and link it from docs/manual/deploy.md.