Skip to content

Adversarial CI gate lacks per category failure thresholds and deterministic regression fixtures #1313

@Bhanudahiyaa

Description

@Bhanudahiyaa

Description

The adversarial testing harness currently supports a global pass-rate gate, but CI policy enforcement can be stricter and more explainable with category-level thresholds and stable regression fixtures.

Current Behavior

  • CI gate mainly relies on PASS_RATE_MIN.
  • Per-category failure budgets are not enforced.
  • Regression inputs are not explicitly versioned as deterministic fixtures for CI policy auditing.

Why This Is a Problem

  • A suite can pass the global rate while still regressing in critical categories like secrets exfiltration.
  • It is harder to detect policy drift in a deterministic and reviewable way across runs.

Expected Behavior

  • Support global and category-specific fail thresholds in CI gate config.
  • Add stricter adversarial categories relevant to production policy risk.
  • Include deterministic regression fixtures for stable CI and reproducible failures.
  • Ensure tests validate YAML/fixture → suite → policy/gate behavior.

Proposed Implementation

  • Add categories for DataExfiltration and ToolPrivilegeEscalation.
  • Add deterministic adversarial fixture suite (regression_suite.json).
  • Extend CI gate config to support:
    • MAX_FAILURES
    • MAX_FAILURES_BY_CATEGORY (e.g. secrets_exfiltration=0,prompt_injection=0)
  • Update tests to cover category threshold failures and fixture determinism.
  • Wire workflow env vars in adversarial CI workflow.

Metadata

Metadata

Assignees

Labels

area/infraBuilds, release pipeline, CI infrastructuregithub_actionsPull requests that update GitHub Actions codekind/bugSomething is brokenpriority/p1High impactsecurity

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions