Geval

Decision orchestration and reconciliation for AI changes.

You bring all kinds of signals and your rules. Geval orchestrates and reconciles them into one outcome. No brain — just your rules applied, every time.

Demo video

Watch the Geval demo on YouTube: a walkthrough of how Geval turns signals and policy rules into PASS, REQUIRE_APPROVAL, or BLOCK.

Try it in under a minute

1. Download (pick your OS):

# Linux
curl -sSL https://github.com/geval-labs/geval/releases/latest/download/geval-linux-x86_64 -o geval && chmod +x geval

# macOS (Apple Silicon)
curl -sSL https://github.com/geval-labs/geval/releases/latest/download/geval-macos-aarch64 -o geval && chmod +x geval

# Windows (PowerShell) — see note below
Invoke-WebRequest -Uri https://github.com/geval-labs/geval/releases/latest/download/geval-windows-x86_64.exe -OutFile geval.exe

Windows: Open PowerShell as Administrator (right‑click → Run as administrator). Then run the download command and .\geval.exe demo.

2. Run the demo (no files needed):

./geval demo          # Linux / macOS — use ./ so you run this binary, not another "geval" in PATH
.\geval.exe demo      # Windows (same folder as geval.exe)

You get a report and one outcome: PASS, REQUIRE_APPROVAL, or BLOCK — produced by the demo contract and signals. Use in CI →

No binary for your OS? Build from source.

If you see "unknown command 'init'" or "required option '--eval'" — you're running a different program named geval (e.g. from npm or another install). Use the binary from Releases or build from source and run it with ./geval (or put it first in your PATH).

Start from a template (like create-react-app)

Inside your project (your codebase is not changed except for one new folder), run the same binary you downloaded (e.g. ./geval):

./geval init          # or: /path/to/geval init

This creates a .geval folder with:

contract.yaml — Names your release gate, versions it, and lists policy files to evaluate.
policies/ — Two starter files with descriptive names (safety-and-blocking.yaml, quality-and-approval.yaml); edit rules to match your metrics.
signals.json — Example pipeline metrics; replace with your real signal names and values.
README.md — What each file is for and how to run checks.

Then run:

./geval check --contract .geval/contract.yaml --signals .geval/signals.json

Use a different folder: ./geval init my-rules. Overwrite existing files: ./geval init --force.

Updating

Use the same download commands and replace your old binary with the new one. Check the version with ./geval --version.

Use Geval with your own signals and contract

You need a contract (one YAML file that references one or more policy files) and a signals file. Geval evaluates each policy against the same signals, then combines the outcomes according to the contract's combination rule (e.g. all must pass, or any block blocks). Use ./geval init for a template with a contract and two policies, or create the files yourself by following the steps below.

All kinds of signals: Not every signal needs a score. You can mix entries that carry a numeric value with entries that have no value (presence-only). Use a rule with operator: presence to match “this metric exists.” Details →

Step 1: Your signals (data file)

A list of evidence: what you measured, observed, or flagged. Each item has a metric (name). Value is optional — use it for scores; omit it for “this happened” (presence-only).

Example — save as mydata.json:

{
  "signals": [
    { "metric": "accuracy", "value": 0.94 },
    { "metric": "engagement_drop", "value": 0.02 }
  ]
}

You can add labels like component or system if you need them. Full example →
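If your pipeline already produces metrics in code, the signals file can be generated rather than written by hand. The following is a minimal sketch, assuming only the JSON shape shown above (a top-level "signals" list whose items have a "metric" and an optional "value"). The helper names build_signals and write_signals and the metric human_review_completed are hypothetical and are not part of Geval.

```python
import json
from pathlib import Path


def build_signals(scores, flags=()):
    """Return a signals document in the shape shown in this README.

    scores: mapping of metric name -> numeric value
    flags:  metric names recorded without a value (presence-only)
    """
    signals = [{"metric": name, "value": value} for name, value in scores.items()]
    # Presence-only entries simply omit the "value" key.
    signals += [{"metric": name} for name in flags]
    return {"signals": signals}


def write_signals(path, scores, flags=()):
    """Write the signals document to `path` as JSON and return it."""
    doc = build_signals(scores, flags)
    Path(path).write_text(json.dumps(doc, indent=2) + "\n", encoding="utf-8")
    return doc


if __name__ == "__main__":
    write_signals(
        "mydata.json",
        scores={"accuracy": 0.94, "engagement_drop": 0.02},
        flags=["human_review_completed"],  # hypothetical presence-only metric
    )
```

Running the script writes mydata.json with the two scored metrics from the example above plus one presence-only entry, which a rule with operator: presence could then match.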

Step 2: Your contract and policies

A contract is a YAML file that lists one or more policy files and a combination rule (how to merge their outcomes). Each policy file contains rules, each with a unique priority, of the form: when [condition on signals], then [pass / block / require_approval].

Prefer a form instead of writing YAML by hand? Use config.geval.io to generate Geval-compatible contract.yaml and policy files (download or copy), then validate with geval validate-contract and run geval check as below.

Example contract — save as contract.yaml:

name: my-gate
version: "1.0.0"
combine: worst_case
policies:
  - path: policy.yaml

Example policy — save as policy.yaml (path relative to the contract file):

name: quality
version: "1.0.0"
policy:
  rules:
    - priority: 1
      name: block_bad_engagement
      when:
        metric: engagement_drop
        operator: ">"
        threshold: 0
      then:
        action: block
    - priority: 2
      name: allow_good_accuracy
      when:
        metric: accuracy
        operator: ">="
        threshold: 0.9
      then:
        action: pass

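Combine (worst_case): any BLOCK wins; otherwise any REQUIRE_APPROVAL; otherwise PASS. Rule priorities must be unique within a policy, and 1 is the highest. Geval records every matching rule, and the match with the highest priority (lowest number) sets that policy's outcome. With mydata.json from Step 1, both rules match; block_bad_engagement has priority 1, so the policy outcome is BLOCK. Operators: >, <, >=, <=, ==, presence. Actions: pass, block, require_approval.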
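The priority and worst_case semantics described above can be sketched in a few lines of Python. This illustrates the rules as this README states them and is not Geval's implementation. The function names (rule_matches, evaluate_policy, combine_worst_case) are hypothetical, and two behaviors are assumptions: a policy with no matching rule yields pass, and a rule matches if any signal with its metric satisfies the condition.

```python
import operator

# Comparison operators listed in this README; "presence" is handled separately.
COMPARATORS = {
    ">": operator.gt,
    "<": operator.lt,
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
}

# Severity order for worst_case: a higher number is a worse outcome.
SEVERITY = {"pass": 0, "require_approval": 1, "block": 2}


def rule_matches(rule, signals):
    """True if any signal with the rule's metric satisfies its condition."""
    when = rule["when"]
    for signal in signals:
        if signal["metric"] != when["metric"]:
            continue
        if when["operator"] == "presence":
            return True  # the metric exists; its value is irrelevant
        value = signal.get("value")
        if value is not None and COMPARATORS[when["operator"]](value, when["threshold"]):
            return True
    return False


def evaluate_policy(rules, signals, default="pass"):
    """Return (action, matched rule names); the lowest priority number wins."""
    matched = [rule for rule in rules if rule_matches(rule, signals)]
    if not matched:
        return default, []  # assumption: no matching rule means pass
    winner = min(matched, key=lambda rule: rule["priority"])
    return winner["then"]["action"], [rule["name"] for rule in matched]


def combine_worst_case(actions):
    """Any block wins, else any require_approval, else pass."""
    return max(actions, key=SEVERITY.__getitem__, default="pass")


# The example policy and signals from this README, as Python data.
RULES = [
    {"priority": 1, "name": "block_bad_engagement",
     "when": {"metric": "engagement_drop", "operator": ">", "threshold": 0},
     "then": {"action": "block"}},
    {"priority": 2, "name": "allow_good_accuracy",
     "when": {"metric": "accuracy", "operator": ">=", "threshold": 0.9},
     "then": {"action": "pass"}},
]
SIGNALS = [
    {"metric": "accuracy", "value": 0.94},
    {"metric": "engagement_drop", "value": 0.02},
]

if __name__ == "__main__":
    action, matched = evaluate_policy(RULES, SIGNALS)
    print(action, matched)
    # prints: block ['block_bad_engagement', 'allow_good_accuracy']
    print(combine_worst_case([action, "pass"]))  # prints: block
```

Both rules match the example signals, both matches are reported, and the priority-1 rule decides the policy outcome. Combining that outcome with any other policy's result under worst_case still yields block.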

Full example → and policy →

Step 3: Run Geval

./geval check --contract contract.yaml --signals mydata.json

(Windows: .\geval.exe check --contract contract.yaml --signals mydata.json)

Step 4: Read the outcome

PASS — Every policy passed (or combined rule says go).
REQUIRE_APPROVAL — At least one policy requires approval.
BLOCK — At least one policy blocks.

To see per-policy results and the combined decision:

./geval explain --contract contract.yaml --signals mydata.json

To validate the contract and all referenced policies:

./geval validate-contract contract.yaml

The problem

You have many signals: scores, A/B results, human reviews, flags. You change a model or a prompt. Then what?

One signal says “better.”
Another says “worse.”
Someone asks: “Do we ship?”

Today that call happens in chat or a meeting. Hard to repeat. Hard to audit. You don't need a system that "decides" for you — you need orchestration and reconciliation: one place to define rules, one place to feed all your signals (not just numbers), and one deterministic outcome every time.

What Geval is

Geval is a decision orchestration and reconciliation engine. It does not make decisions. It has no brain. You provide:

Your signals (one file): any kind, including scores, presence-only entries, flags, and labels. Non-uniform is fine.
Your rules (a contract that references one or more policy files): e.g. “If engagement drops, block. If accuracy is below X, need approval.”

Geval orchestrates the run and reconciles your signals against your rules in order. Same inputs + same rules = same outcome. It returns:

| Outcome | Meaning |
| --- | --- |
| PASS | No rule matched a block or require-approval. Good to go. |
| REQUIRE_APPROVAL | A rule matched; it says a person must approve first. |
| BLOCK | A rule matched; it says don’t ship. Fix first. |

Each run is recorded: which rules, which signals, when. So you can always answer: “Why did we ship?” and “Who approved?” — without any black box.

Commands

Run with ./geval (or ensure this repo’s binary is the one in your PATH):

| Command | What it does |
| --- | --- |
| `./geval demo` | Run the built-in example. Try this first. |
| `./geval init` | Create .geval/ with contract and policies. Edit and run. |
| `./geval check --contract <file> --signals <file>` | Evaluate contract → one outcome (PASS / REQUIRE_APPROVAL / BLOCK) |
| `./geval explain --contract <file> --signals <file>` | Per-policy results and combined decision report |
| `./geval validate-contract <file>` | Validate contract and all referenced policies |
| `./geval approve` / `./geval reject` | Record a person’s approval or rejection |

Documentation

| Guide | Description |
| --- | --- |
| Demo video (YouTube) | Walkthrough of Geval |
| Config generator (web) | Fill in forms → download `contract.yaml` and policies |
| Architecture | Contract = multiple policies + combine rule; module layout |
| Signals and rules | Non-uniform signals (scores, presence-only, mix); how rules use them |
| Signal assumptions | What we assume; what input forms we accept (number, string, trace, object) |
| Versioning | Contract, policy, and signals versioning; nothing unversioned |
| Extending | How to add a combination rule or change behavior; process and conventions |
| GitHub Actions | Use Geval in CI |
| Examples | Sample data and rules files |
| Customer demo (feature story) | Signals, policies, rules, and PASS/BLOCK/approval narrative for demos |
| Installation | Install, PATH, build from source |
| Developer workflow | PRs, check, approve/reject |
| Auditing | How decisions are recorded |

Contributing

Contributions are welcome; see CONTRIBUTING.md. To build from source, see Installation.

License

Website • Demo video • Config generator • Releases • GitHub
