Benchmarking: required scenarios end-to-end with safety guard and CI gating #784

> [!NOTE]
> **Sub-issue of #744.** Before reading further, read #744 in full, including all of its comments, where the benchmarking tool's design is worked out in detail. This issue is one slice of that design and assumes the decisions recorded in those comments.

This is step 3 of 5. It depends on #783 (and on #782). This is where the #744 acceptance criteria are first met end-to-end.


Scope
-----

### `inbox` and `webfinger` scenarios

 -  `inbox` (completing it from #783 if partial): takes a `recipient` (a handle like `acct:alice@host` or an actor URI) rather than a path, since Fedify has no default paths. The inbox URL is discovered the way a real peer discovers it: WebFinger gives the actor URI, and the actor document gives `inbox` and `endpoints.sharedInbox`. The `inbox` option selects `shared` (the realistic default), `personal`, or an explicit URL. An `embedObject` flag distinguishes the pure inbox path from inbox-plus-dereference. Discovery is one-time setup and is excluded from the timed window.
 -  `webfinger`: handle resolution over configurable handle sets, the discovery primitive the other scenarios reuse.
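
The discovery chain above can be sketched as two pure functions over already-fetched documents. This is only an illustration, not the engine's actual code: the type names (`Jrd`, `ActorDocument`, `InboxMode`) and function names are hypothetical, and falling back to the personal inbox when the actor has no `endpoints.sharedInbox` is an assumption this issue does not specify.

```typescript
// Hypothetical shapes; only the fields the sketch needs are modeled.
interface Jrd {
  subject?: string;
  links?: { rel: string; type?: string; href?: string }[];
}

interface ActorDocument {
  id: string;
  inbox: string;
  endpoints?: { sharedInbox?: string };
}

// "shared" | "personal" | an explicit URL, mirroring the `inbox` option.
type InboxMode = "shared" | "personal" | { url: string };

// Step 1: WebFinger gives the actor URI (the `self` link with an
// ActivityPub media type).
function actorUriFromJrd(jrd: Jrd): string | undefined {
  const link = (jrd.links ?? []).find(
    (l) =>
      l.rel === "self" &&
      (l.type === "application/activity+json" ||
        (l.type ?? "").startsWith("application/ld+json")),
  );
  return link?.href;
}

// Step 2: the actor document gives `inbox` and `endpoints.sharedInbox`.
function selectInbox(actor: ActorDocument, mode: InboxMode): string {
  if (typeof mode === "object") return mode.url; // explicit URL wins
  if (mode === "shared") {
    // ASSUMPTION: fall back to the personal inbox if no shared inbox exists.
    return actor.endpoints?.sharedInbox ?? actor.inbox;
  }
  return actor.inbox;
}
```

Both steps would run once during setup, so their cost stays outside the timed window.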

### Client-side safety guard

 -  Target classes derived from the resolved address: `loopback` (`127.0.0.0/8`, `::1`, `localhost`) and `private` (RFC 1918, link-local, `.local`) versus `public`.
 -  At startup the tool probes `GET /.well-known/fedify/bench/stats` to detect whether the target advertises `benchmarkMode`, which is the operator's “not production” assertion.
 -  Two tiers: Safe (target is `loopback`/`private`, or advertises `benchmarkMode`) runs with no friction; Caution (a `public` target without `benchmarkMode`) is refused unless `--allow-unsafe-target` is given.
 -  `--allow-unsafe-target` is honored only together with an explicit `--target`. In CI or any non-TTY context the tool never prompts; the flag is mandatory there. A TTY may offer an interactive confirmation instead.
 -  Scenario effect classes (`read`/`write`/`deliver`/`fault`) drive the warning text. `--dry-run` resolves discovery and reports the planned load without sending. On a `public` target, `rate` and `duration` must be set explicitly (no aggressive defaults).
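
A minimal sketch of how the guard rules above could combine, assuming the address has already been resolved and the `stats` probe has already run. All names here (`classifyTarget`, `GuardInput`, `decide`) are hypothetical, and the classifier handles only the literal forms listed above; it is not a complete IP parser.

```typescript
type TargetClass = "loopback" | "private" | "public";

// Classify a hostname or a resolved IP literal (illustrative, not exhaustive).
function classifyTarget(host: string): TargetClass {
  const h = host.toLowerCase().replace(/^\[|\]$/g, ""); // strip IPv6 brackets
  if (h === "localhost" || h === "::1") return "loopback";
  if (h.endsWith(".local")) return "private";
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(h);
  if (m) {
    const a = Number(m[1]);
    const b = Number(m[2]);
    if (a === 127) return "loopback"; // 127.0.0.0/8
    if (a === 10) return "private"; // RFC 1918: 10.0.0.0/8
    if (a === 172 && b >= 16 && b <= 31) return "private"; // 172.16.0.0/12
    if (a === 192 && b === 168) return "private"; // 192.168.0.0/16
    if (a === 169 && b === 254) return "private"; // IPv4 link-local
  }
  if (/^fe[89ab][0-9a-f]:/.test(h)) return "private"; // IPv6 link-local fe80::/10
  return "public";
}

interface GuardInput {
  targetClass: TargetClass;
  benchmarkMode: boolean; // advertised by the stats probe
  allowUnsafeTarget: boolean; // --allow-unsafe-target
  explicitTarget: boolean; // --target was given
  isTty: boolean;
}

type GuardDecision = "run" | "prompt" | "refuse";

function decide(i: GuardInput): GuardDecision {
  // Safe tier: loopback/private target, or the target advertises benchmarkMode.
  if (i.targetClass !== "public" || i.benchmarkMode) return "run";
  // Caution tier: the flag counts only together with an explicit --target.
  if (i.allowUnsafeTarget && i.explicitTarget) return "run";
  // Never prompt outside a TTY; a TTY may offer confirmation instead.
  return i.isTty ? "prompt" : "refuse";
}
```

Keeping the decision a pure function of these inputs would make the CI case (non-TTY, no flag, hence `refuse`) easy to cover in a unit test.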

### `expect` gating

 -  A scenario's `expect` thresholds are evaluated and the process exits non-zero on failure. Each entry carries a severity (`warn` or `fail`, default `fail`). The metric vocabulary and the per-type definition of success (for example which status codes count as success for `inbox`) are pinned alongside the schema.
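
As an illustration of the gating rule above, the sketch below evaluates a list of thresholds and derives the exit code. The entry shape (`metric`, `op`, `value`, `severity`), the metric names in the usage example, and the choice to treat a missing metric as a violation are all assumptions made for this sketch; the real vocabulary is whatever gets pinned with the schema.

```typescript
type Severity = "warn" | "fail";

// Hypothetical shape of one `expect` entry.
interface Expectation {
  metric: string;
  op: "<=" | ">=";
  value: number;
  severity?: Severity; // defaults to "fail"
}

interface Violation {
  metric: string;
  actual: number | undefined;
  severity: Severity;
}

function evaluateExpect(
  metrics: Record<string, number>,
  expect: Expectation[],
): { violations: Violation[]; exitCode: number } {
  const violations: Violation[] = [];
  for (const e of expect) {
    const actual: number | undefined = metrics[e.metric];
    // ASSUMPTION: a metric that was not measured counts as a violation.
    const ok =
      actual !== undefined &&
      (e.op === "<=" ? actual <= e.value : actual >= e.value);
    if (!ok) {
      violations.push({
        metric: e.metric,
        actual,
        severity: e.severity ?? "fail",
      });
    }
  }
  // Only `fail`-severity violations make the process exit non-zero.
  const exitCode = violations.some((v) => v.severity === "fail") ? 1 : 0;
  return { violations, exitCode };
}

// Usage with made-up metric names and numbers:
const result = evaluateExpect(
  { successRate: 0.99, p95LatencyMs: 250 },
  [
    { metric: "successRate", op: ">=", value: 0.995 },
    { metric: "p95LatencyMs", op: "<=", value: 200, severity: "warn" },
  ],
);
// result.exitCode is 1: successRate misses its default `fail` threshold.
```

A `warn`-severity violation would still be reported in `violations` but leaves the exit code at zero.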

### Fixture app

 -  An app under *test/bench/* (in-memory KV, in-process queue, `benchmarkMode`, with the recipients the inbox scenario targets) that doubles as the local test server for the scenario tests.


Dependencies
------------

Depends on #783 (the engine and scenario format) and #782 (the `benchmarkMode` target and the `stats` probe).


Acceptance criteria
-------------------

 -  [ ] A documented command runs against a local Fedify app and yields latency, throughput, success rate, and error summaries.
 -  [ ] `inbox` (shared, signed) and `webfinger` work end-to-end, with inbox discovery via WebFinger.
 -  [ ] JSON output suitable for CI comparison; `expect` exits non-zero on a `fail`-severity violation.
 -  [ ] The guard refuses a `public` non-`benchmarkMode` target without `--allow-unsafe-target`, which is mandatory (not a prompt) in CI.
 -  [ ] `--dry-run` resolves discovery and reports planned load without sending.
 -  [ ] The *test/bench/* fixture is used by the scenario tests.


Documentation
-------------

Add usage, safety guidance, and CI examples to *docs/manual/benchmarking.md*, and link it from *docs/manual/deploy.md*.

