Decision orchestration and reconciliation for AI changes.
You bring all kinds of signals and your rules. Geval orchestrates and reconciles them into one outcome. No brain — just your rules applied, every time.
Watch the Geval demo on YouTube: a walkthrough of how Geval turns signals and policy rules into PASS, REQUIRE_APPROVAL, or BLOCK.
1. Download (pick your OS):
# Linux
curl -sSL https://github.com/geval-labs/geval/releases/latest/download/geval-linux-x86_64 -o geval && chmod +x geval
# macOS (Apple Silicon)
curl -sSL https://github.com/geval-labs/geval/releases/latest/download/geval-macos-aarch64 -o geval && chmod +x geval
# Windows (PowerShell) — see note below
Invoke-WebRequest -Uri https://github.com/geval-labs/geval/releases/latest/download/geval-windows-x86_64.exe -OutFile geval.exe

Windows: Open PowerShell as Administrator (right-click → Run as administrator). Then run the download command and .\geval.exe demo.
2. Run the demo (no files needed):
./geval demo # Linux / macOS — use ./ so you run this binary, not another "geval" in PATH
.\geval.exe demo # Windows (same folder as geval.exe)

You get a report and one outcome (PASS, REQUIRE_APPROVAL, or BLOCK) produced by the demo contract and signals. Use in CI →
No binary for your OS? Build from source.
If you see "unknown command 'init'" or "required option '--eval'", you're running a different program named geval (e.g. from npm or another install). Use the binary from Releases or build from source, and run it with ./geval (or put it first in your PATH).
Inside your project (your codebase is not changed except for one new folder), run the same binary you downloaded (e.g. ./geval):
./geval init # or: /path/to/geval init

This creates a .geval folder with:
- contract.yaml: Names your release gate, versions it, and lists policy files to evaluate.
- policies/: Two starter files with descriptive names (safety-and-blocking.yaml, quality-and-approval.yaml); edit rules to match your metrics.
- signals.json: Example pipeline metrics; replace with your real signal names and values.
- README.md: What each file is for and how to run checks.
Then run:
./geval check --contract .geval/contract.yaml --signals .geval/signals.json

Use a different folder: ./geval init my-rules. Overwrite existing files: ./geval init --force.
To upgrade, use the same download commands and replace your old file with the new one. Check the version: geval --version.
You need a contract (one YAML that references one or more policy files) and a signals file. Geval evaluates each policy against the same signals, then combines outcomes (e.g. all must pass, or any block blocks). Use geval init for a template with a contract and two policies, or create the files yourself below.
All kinds of signals: Not every signal needs a score. You can mix: entries with a numeric value, and entries with no value (presence-only). Use a rule with operator: presence to match “this metric exists.” Details →
A list of evidence: what you measured, observed, or flagged. Each item has a metric (name). Value is optional — use it for scores; omit it for “this happened” (presence-only).
Example — save as mydata.json:
{
  "signals": [
    { "metric": "accuracy", "value": 0.94 },
    { "metric": "engagement_drop", "value": 0.02 }
  ]
}

You can add labels like component or system if you need them. Full example →
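The signals format described above can be illustrated with a short Python sketch. It is not part of Geval: the helper names validate_signals and split_signals, the metric human_review_flagged, and the label value "ranker" are hypothetical, and the only structural rule assumed is the one stated above (every entry has a metric; value is optional).

```python
import json

# Example signals document: two scored entries plus one presence-only entry.
# "human_review_flagged" and the "component" label value are made-up examples.
doc = {
    "signals": [
        {"metric": "accuracy", "value": 0.94, "component": "ranker"},
        {"metric": "engagement_drop", "value": 0.02},
        {"metric": "human_review_flagged"},  # no value: presence-only
    ]
}

def validate_signals(doc):
    """Return a list of problems found in a signals document (empty if none)."""
    signals = doc.get("signals")
    if not isinstance(signals, list):
        return ["top-level 'signals' must be a list"]
    problems = []
    for i, s in enumerate(signals):
        # Assumption: the only required field is a non-empty string 'metric'.
        if not isinstance(s, dict) or not isinstance(s.get("metric"), str) or not s["metric"]:
            problems.append(f"signal {i}: missing 'metric' name")
    return problems

def split_signals(doc):
    """Separate scored metrics from presence-only metrics."""
    scored = [s["metric"] for s in doc["signals"] if "value" in s]
    presence_only = [s["metric"] for s in doc["signals"] if "value" not in s]
    return scored, presence_only

print(validate_signals(doc))  # prints []
print(split_signals(doc))     # prints (['accuracy', 'engagement_drop'], ['human_review_flagged'])
signals_json = json.dumps(doc, indent=2)  # text you could save as mydata.json
```

A presence-only entry like the third one is what a rule with operator: presence is meant to match.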
A contract is a YAML file that lists one or more policy files and a combination rule (how to merge their outcomes). Each policy file contains rules with unique priorities: When [condition on signals], then [pass / block / require_approval].
Prefer a form instead of writing YAML by hand? Use config.geval.io to generate Geval-compatible contract.yaml and policy files (download or copy), then validate with geval validate-contract and run geval check as below.
Example contract — save as contract.yaml:
name: my-gate
version: "1.0.0"
combine: worst_case
policies:
  - path: policy.yaml

Example policy (save as policy.yaml; path relative to the contract file):
name: quality
version: "1.0.0"
policy:
  rules:
    - priority: 1
      name: block_bad_engagement
      when:
        metric: engagement_drop
        operator: ">"
        threshold: 0
      then:
        action: block
    - priority: 2
      name: allow_good_accuracy
      when:
        metric: accuracy
        operator: ">="
        threshold: 0.9
      then:
        action: pass

Combine (worst_case): any BLOCK wins; else any require_approval; else pass. Rule priorities must be unique per policy; 1 = highest; Geval records every match and the best priority wins. Operators: >, <, >=, <=, ==, presence. Actions: pass, block, require_approval.
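To make the priority rule concrete, here is a minimal Python sketch of how one policy could be evaluated against the example signals. It is an illustration only, not Geval's implementation: the function names rule_matches and evaluate_policy are hypothetical, and falling back to pass when no rule matches is an assumption.

```python
import operator

# Comparison operators listed in this README; "presence" is handled separately.
COMPARATORS = {
    ">": operator.gt, "<": operator.lt,
    ">=": operator.ge, "<=": operator.le, "==": operator.eq,
}

def rule_matches(rule, signals):
    """True if any signal satisfies the rule's 'when' condition."""
    when = rule["when"]
    for s in signals:
        if s["metric"] != when["metric"]:
            continue
        if when["operator"] == "presence":
            return True  # the metric exists; no value needed
        if "value" in s and COMPARATORS[when["operator"]](s["value"], when["threshold"]):
            return True
    return False

def evaluate_policy(rules, signals):
    """Return (action, names of all matched rules in priority order)."""
    priorities = [r["priority"] for r in rules]
    if len(set(priorities)) != len(priorities):
        raise ValueError("rule priorities must be unique per policy")
    # Keep every match, ordered so that priority 1 (highest) comes first.
    matches = sorted((r for r in rules if rule_matches(r, signals)),
                     key=lambda r: r["priority"])
    # Assumption: if nothing matches, the policy passes.
    action = matches[0]["then"]["action"] if matches else "pass"
    return action, [r["name"] for r in matches]

# The rules from policy.yaml and the signals from mydata.json, as Python data.
rules = [
    {"priority": 1, "name": "block_bad_engagement",
     "when": {"metric": "engagement_drop", "operator": ">", "threshold": 0},
     "then": {"action": "block"}},
    {"priority": 2, "name": "allow_good_accuracy",
     "when": {"metric": "accuracy", "operator": ">=", "threshold": 0.9},
     "then": {"action": "pass"}},
]
signals = [{"metric": "accuracy", "value": 0.94},
           {"metric": "engagement_drop", "value": 0.02}]

print(evaluate_policy(rules, signals))
# prints ('block', ['block_bad_engagement', 'allow_good_accuracy'])
```

Both rules match the example signals, but the priority 1 rule wins, so under these assumptions the example policy yields block.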
./geval check --contract contract.yaml --signals mydata.json

(Windows: .\geval.exe check --contract contract.yaml --signals mydata.json)
- PASS — Every policy passed (or combined rule says go).
- REQUIRE_APPROVAL — At least one policy requires approval.
- BLOCK — At least one policy blocks.
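The worst_case combination behind these three outcomes can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Geval's code: combine_worst_case and SEVERITY are hypothetical names, and rejecting an empty list reflects the statement that a contract references one or more policies.

```python
# Severity order for worst_case: block beats require_approval beats pass.
SEVERITY = {"pass": 0, "require_approval": 1, "block": 2}

def combine_worst_case(policy_actions):
    """Merge per-policy actions into one outcome; the most severe action wins."""
    if not policy_actions:
        raise ValueError("a contract needs at least one policy")
    worst = max(policy_actions, key=lambda action: SEVERITY[action])
    return worst.upper()

print(combine_worst_case(["pass", "pass"]))                       # prints PASS
print(combine_worst_case(["pass", "require_approval"]))           # prints REQUIRE_APPROVAL
print(combine_worst_case(["require_approval", "block", "pass"]))  # prints BLOCK
```

Because the result depends only on the set of per-policy actions, the order in which policies are listed does not change the outcome in this sketch.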
To see per-policy results and the combined decision:
./geval explain --contract contract.yaml --signals mydata.json

To validate the contract and all referenced policies:

./geval validate-contract contract.yaml

The problem • What Geval is • Commands • Docs
You have many signals: scores, A/B results, human reviews, flags. You change a model or a prompt. Then what?
- One signal says “better.”
- Another says “worse.”
- Someone asks: “Do we ship?”
Today that call happens in chat or a meeting. Hard to repeat. Hard to audit. You don't need a system that "decides" for you — you need orchestration and reconciliation: one place to define rules, one place to feed all your signals (not just numbers), and one deterministic outcome every time.
Geval is a decision orchestration and reconciliation engine. It does not make decisions. It has no brain. You provide:
- Your signals (one file) — any kind: scores, presence-only, flags, labels. Non-uniform is fine.
- Your rules (one file) — e.g. “If engagement drops, block. If accuracy is below X, need approval.”
Geval orchestrates the run and reconciles your signals against your rules in order. Same inputs + same rules = same outcome. It returns:
| Outcome | Meaning |
|---|---|
| PASS | No rule matched a block or require-approval. Good to go. |
| REQUIRE_APPROVAL | A rule matched; it says a person must approve first. |
| BLOCK | A rule matched; it says don’t ship. Fix first. |
Each run is recorded: which rules, which signals, when. So you can always answer: “Why did we ship?” and “Who approved?” — without any black box.
Run with ./geval (or ensure this repo’s binary is the one in your PATH):
| Command | What it does |
|---|---|
| ./geval demo | Run the built-in example. Try this first. |
| ./geval init | Create .geval/ with contract and policies. Edit and run. |
| ./geval check --contract <file> --signals <file> | Evaluate contract → one outcome (PASS / REQUIRE_APPROVAL / BLOCK) |
| ./geval explain --contract <file> --signals <file> | Per-policy results and combined decision report |
| ./geval validate-contract <file> | Validate contract and all referenced policies |
| ./geval approve / ./geval reject | Record a person’s approval or rejection |
| Guide | Description |
|---|---|
| Demo video (YouTube) | Walkthrough of Geval |
| Config generator (web) | Fill in forms → download contract.yaml and policies |
| Architecture | Contract = multiple policies + combine rule; module layout |
| Signals and rules | Non-uniform signals (scores, presence-only, mix); how rules use them |
| Signal assumptions | What we assume; what input forms we accept (number, string, trace, object) |
| Versioning | Contract, policy, and signals versioning; nothing unversioned |
| Extending | How to add a combination rule or change behavior; process and conventions |
| GitHub Actions | Use Geval in CI |
| Examples | Sample data and rules files |
| Customer demo (feature story) | Signals, policies, rules, and PASS/BLOCK/approval narrative for demos |
| Installation | Install, PATH, build from source |
| Developer workflow | PRs, check, approve/reject |
| Auditing | How decisions are recorded |
Contributions welcome; see CONTRIBUTING.md. To build from source, see Installation.
MIT © Geval Contributors
Website • Demo video • Config generator • Releases • GitHub