From 5c79da02d6e912e9b0e8c11ef749e9de94f593ac Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 11:25:14 +0000 Subject: [PATCH 01/13] docs: condense README to elevator pitch (#478) --- README.md | 206 +++++++++++------------------------------------------- 1 file changed, 39 insertions(+), 167 deletions(-) diff --git a/README.md b/README.md index ff5a7682e..442d8fa55 100644 --- a/README.md +++ b/README.md @@ -1,14 +1,15 @@ - +Mellea logo -# Mellea - -Mellea is a library for writing generative programs. -Generative programming replaces flaky agents and brittle prompts -with structured, maintainable, robust, and efficient AI workflows. +# Mellea — build predictable AI without guesswork +Inside every AI-powered pipeline, the unreliable part is the same: the LLM call itself. +Silent failures, untestable outputs, no guarantees. +Mellea wraps those calls in Python you can read, test, and reason about — +type-annotated outputs, verifiable requirements, automatic retries. [//]: # ([![arXiv](https://img.shields.io/badge/arXiv-2408.09869-b31b1b.svg)](https://arxiv.org/abs/2408.09869)) -[![Docs](https://img.shields.io/badge/docs-live-brightgreen)](https://docs.mellea.ai/) +[![Website](https://img.shields.io/badge/website-mellea.ai-blue)](https://mellea.ai/) +[![Docs](https://img.shields.io/badge/docs-docs.mellea.ai-brightgreen)](https://docs.mellea.ai/) [![PyPI version](https://img.shields.io/pypi/v/mellea)](https://pypi.org/project/mellea/) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/mellea)](https://pypi.org/project/mellea/) [![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv) @@ -18,189 +19,60 @@ with structured, maintainable, robust, and efficient AI workflows. 
[![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-3.0-4baaaa.svg)](CODE_OF_CONDUCT.md) [![Discord](https://img.shields.io/discord/1448407063813165219?logo=discord&logoColor=white&label=Discord&color=7289DA)](https://ibm.biz/mellea-discord) - -## Features - - * A standard library of opinionated prompting patterns. - * Sampling strategies for inference-time scaling. - * Clean integration between verifiers and samplers. - - Batteries-included library of verifiers. - - Support for efficient checking of specialized requirements using - activated LoRAs. - - Train your own verifiers on proprietary classifier data. - * Compatible with many inference services and model families. Control cost - and quality by easily lifting and shifting workloads between: - - inference providers - - model families - - model sizes - * Easily integrate the power of LLMs into legacy code-bases (mify). - * Sketch applications by writing specifications and letting `mellea` fill in - the details (generative slots). - * Get started by decomposing your large unwieldy prompts into structured and maintainable mellea problems. - - - -## Getting Started - -You can get started with a local install, or by using Colab notebooks. - -### Getting Started with Local Inference - - - -Install with [uv](https://docs.astral.sh/uv/getting-started/installation/): +## Install ```bash uv pip install mellea ``` -Install with pip: +See [installation docs](https://docs.mellea.ai/getting-started/installation) for extras (`[hf]`, `[watsonx]`, `[docling]`, `[all]`, …) and source installation. -```bash -pip install mellea -``` - -> [!NOTE] -> `mellea` comes with some additional packages as defined in our `pyproject.toml`. 
If you would like to install all the extra optional dependencies, please run the following commands: -> -> ```bash -> uv pip install "mellea[hf]" # for Huggingface extras and Alora capabilities -> uv pip install "mellea[watsonx]" # for watsonx backend -> uv pip install "mellea[docling]" # for docling -> uv pip install "mellea[smolagents]" # for HuggingFace smolagents tools -> uv pip install "mellea[all]" # for all the optional dependencies -> ``` -> -> You can also install all the optional dependencies with `uv sync --all-extras` - -> [!NOTE] -> If running on an Intel mac, you may get errors related to torch/torchvision versions. Conda maintains updated versions of these packages. You will need to create a conda environment and run `conda install 'torchvision>=0.22.0'` (this should also install pytorch and torchvision-extra). Then, you should be able to run `uv pip install mellea`. To run the examples, you will need to use `python ` inside the conda environment instead of `uv run --with mellea `. - -> [!NOTE] -> If you are using python >= 3.13, you may encounter an issue where outlines cannot be installed due to rust compiler issues (`error: can't find Rust compiler`). You can either downgrade to python 3.12 or install the [rust compiler](https://www.rust-lang.org/tools/install) to build the wheel for outlines locally. - -For running a simple LLM request locally (using Ollama with Granite model), this is the starting code: -```python -# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/tutorial/example.py -import mellea - -m = mellea.start_session() -print(m.chat("What is the etymology of mellea?").content) -``` - - -Then run it: -> [!NOTE] -> Before we get started, you will need to download and install [ollama](https://ollama.com/). Mellea can work with many different types of backends, but everything in this tutorial will "just work" on a Macbook running IBM's Granite 4 Micro 3B model. 
-```shell -uv run --with mellea docs/examples/tutorial/example.py -``` - -### Get Started with Colab - -| Notebook | Try in Colab | Goal | -|----------|--------------|------| -| Hello, World | Open In Colab | Quick‑start demo | -| Simple Email | Open In Colab | Using the `m.instruct` primitive | -| Instruct-Validate-Repair | Open In Colab | Introduces our first generative programming design pattern | -| Model Options | Open In Colab | Demonstrates how to pass model options through to backends | -| Sentiment Classifier | Open In Colab | Introduces the `@generative` decorator | -| Managing Context | Open In Colab | Shows how to construct and manage context in a `MelleaSession` | -| Generative OOP | Open In Colab | Demonstrates object-oriented generative programming in Mellea | -| Rich Documents | Open In Colab | A generative program that uses Docling to work with rich-text documents | -| Composing Generative Functions | Open In Colab | Demonstrates contract-oriented programming in Mellea | -| `m serve` | Open In Colab | Serve a generative program as an openai-compatible model endpoint | -| MCP | Open In Colab | Mellea + MCP | - - -### Installing from Source - -If you want to contribute to Mellea or need the latest development version, see the -[Getting Started](CONTRIBUTING.md#getting-started) section in our Contributing Guide for -detailed installation instructions. - -## Getting started with validation - -Mellea supports validation of generation results through a **instruct-validate-repair** pattern. -Below, the request for *"Write an email.."* is constrained by the requirements of *"be formal"* and *"Use 'Dear interns' as greeting."*. -Using a simple rejection sampling strategy, the request is sent up to three (loop_budget) times to the model and -the output is checked against the constraints using (in this case) LLM-as-a-judge. 
- - -```python -# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/instruct_validate_repair/101_email_with_validate.py -from mellea import MelleaSession -from mellea.backends import ModelOption -from mellea.backends.ollama import OllamaModelBackend -from mellea.backends import model_ids -from mellea.stdlib.sampling import RejectionSamplingStrategy - -# create a session with Mistral running on Ollama -m = MelleaSession( - backend=OllamaModelBackend( - model_id=model_ids.MISTRALAI_MISTRAL_0_3_7B, - model_options={ModelOption.MAX_NEW_TOKENS: 300}, - ) -) - -# run an instruction with requirements -email_v1 = m.instruct( - "Write an email to invite all interns to the office party.", - requirements=["be formal", "Use 'Dear interns' as greeting."], - strategy=RejectionSamplingStrategy(loop_budget=3), -) - -# print result -print(f"***** email ****\n{str(email_v1)}\n*******") -``` - - -## Getting Started with Generative Slots - -Generative slots allow you to define functions without implementing them. -The `@generative` decorator marks a function as one that should be interpreted by querying an LLM. -The example below demonstrates how an LLM's sentiment classification -capability can be wrapped up as a function using Mellea's generative slots and -a local LLM. +## Example +The `@generative` decorator turns a typed Python function into a structured LLM call. 
+Docstrings become prompts, type hints become schemas — no parsers, no chains: ```python -# file: https://github.com/generative-computing/mellea/blob/main/docs/examples/tutorial/sentiment_classifier.py#L1-L13 -from typing import Literal +from pydantic import BaseModel from mellea import generative, start_session +class UserProfile(BaseModel): + name: str + age: int @generative -def classify_sentiment(text: str) -> Literal["positive", "negative"]: - """Classify the sentiment of the input text as 'positive' or 'negative'.""" +def extract_user(text: str) -> UserProfile: + """Extract the user's name and age from the text.""" - -if __name__ == "__main__": - m = start_session() - sentiment = classify_sentiment(m, text="I love this!") - print("Output sentiment is:", sentiment) +m = start_session() +user = extract_user(m, text="User log 42: Alice is 31 years old.") +print(user.name) # Alice +print(user.age) # 31 — always an int, guaranteed by the schema ``` +## Learn More -## Contributing +| Resource | | +|---|---| +| [mellea.ai](https://mellea.ai) | Vision, features, and live demos | +| [docs.mellea.ai](https://docs.mellea.ai) | Full docs — tutorials, API reference, how-to guides | +| [Colab notebooks](docs/examples/notebooks/) | Interactive examples you can run immediately | +| [Code examples](docs/examples/) | Runnable examples: RAG, agents, IVR, MObjects, and more | -We welcome contributions to Mellea! There are several ways to contribute: +## Contributing -1. **Contributing to this repository** - Core features, bug fixes, standard library components -2. **Applications & Libraries** - Build tools using Mellea (host in your own repo with `mellea-` prefix) -3. **Community Components** - Contribute to [mellea-contribs](https://github.com/generative-computing/mellea-contribs) +We welcome contributions of all kinds — bug fixes, new backends, standard library components, examples, and docs. 
-Please see our **[Contributing Guide](CONTRIBUTING.md)** for detailed information on: -- Getting started with development -- Coding standards and workflow -- Testing guidelines -- How to contribute specific types of components +- **[Contributing Guide](https://docs.mellea.ai/community/contributing-guide)** — development setup, workflow, and coding standards +- **[Building Extensions](https://docs.mellea.ai/community/building-extensions)** — create reusable components in your own repo +- **[mellea-contribs](https://github.com/generative-computing/mellea-contribs)** — community library for shared components -Questions? Join our [Discord](https://ibm.biz/mellea-discord)! +Questions? Join our [Discord](https://ibm.biz/mellea-discord). ### IBM ❤️ Open Source AI -Mellea has been started by IBM Research in Cambridge, MA. - +Mellea was started by IBM Research in Cambridge, MA. +--- +Licensed under the [Apache-2.0 License](LICENSE). Copyright © 2026 Mellea. From 611358528b470a2901e1d2e9f8c83b0fc37f788d Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 11:25:48 +0000 Subject: [PATCH 02/13] docs: link contributing guide to CONTRIBUTING.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 442d8fa55..464cf8627 100644 --- a/README.md +++ b/README.md @@ -63,7 +63,7 @@ print(user.age) # 31 — always an int, guaranteed by the schema We welcome contributions of all kinds — bug fixes, new backends, standard library components, examples, and docs. 
-- **[Contributing Guide](https://docs.mellea.ai/community/contributing-guide)** — development setup, workflow, and coding standards +- **[Contributing Guide](CONTRIBUTING.md)** — development setup, workflow, and coding standards - **[Building Extensions](https://docs.mellea.ai/community/building-extensions)** — create reusable components in your own repo - **[mellea-contribs](https://github.com/generative-computing/mellea-contribs)** — community library for shared components From 5b291d07b3557097b5c10928f3ce4769ad4980ce Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 11:29:12 +0000 Subject: [PATCH 03/13] docs: fix license badge link, vision statement, IVR spelling, wording tweaks --- README.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 464cf8627..1443f6e65 100644 --- a/README.md +++ b/README.md @@ -4,8 +4,8 @@ Inside every AI-powered pipeline, the unreliable part is the same: the LLM call itself. Silent failures, untestable outputs, no guarantees. -Mellea wraps those calls in Python you can read, test, and reason about — -type-annotated outputs, verifiable requirements, automatic retries. +Mellea is a Python library for writing *generative programs* — replacing brittle prompts and flaky agents +with structured, testable AI workflows built around type-annotated outputs, verifiable requirements, and automatic retries. [//]: # ([![arXiv](https://img.shields.io/badge/arXiv-2408.09869-b31b1b.svg)](https://arxiv.org/abs/2408.09869)) [![Website](https://img.shields.io/badge/website-mellea.ai-blue)](https://mellea.ai/) @@ -15,7 +15,7 @@ type-annotated outputs, verifiable requirements, automatic retries. 
[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json)](https://github.com/astral-sh/uv) [![Ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff) [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/pre-commit/pre-commit) -[![GitHub License](https://img.shields.io/github/license/generative-computing/mellea)](https://img.shields.io/github/license/generative-computing/mellea) +[![GitHub License](https://img.shields.io/github/license/generative-computing/mellea)](https://github.com/generative-computing/mellea/blob/main/LICENSE) [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-3.0-4baaaa.svg)](CODE_OF_CONDUCT.md) [![Discord](https://img.shields.io/discord/1448407063813165219?logo=discord&logoColor=white&label=Discord&color=7289DA)](https://ibm.biz/mellea-discord) @@ -30,7 +30,7 @@ See [installation docs](https://docs.mellea.ai/getting-started/installation) for ## Example The `@generative` decorator turns a typed Python function into a structured LLM call. 
-Docstrings become prompts, type hints become schemas — no parsers, no chains: +Docstrings become prompts, type hints become schemas — no templates, no parsers: ```python from pydantic import BaseModel @@ -57,7 +57,7 @@ print(user.age) # 31 — always an int, guaranteed by the schema | [mellea.ai](https://mellea.ai) | Vision, features, and live demos | | [docs.mellea.ai](https://docs.mellea.ai) | Full docs — tutorials, API reference, how-to guides | | [Colab notebooks](docs/examples/notebooks/) | Interactive examples you can run immediately | -| [Code examples](docs/examples/) | Runnable examples: RAG, agents, IVR, MObjects, and more | +| [Code examples](docs/examples/) | Runnable examples: RAG, agents, Instruct-Validate-Repair (IVR), MObjects, and more | ## Contributing From 19e205e4c9115acf63ec52a854a75490cc7b1d5d Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 11:30:35 +0000 Subject: [PATCH 04/13] docs: replace Discord link with GitHub Discussions --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 1443f6e65..cf26811ab 100644 --- a/README.md +++ b/README.md @@ -67,7 +67,7 @@ We welcome contributions of all kinds — bug fixes, new backends, standard libr - **[Building Extensions](https://docs.mellea.ai/community/building-extensions)** — create reusable components in your own repo - **[mellea-contribs](https://github.com/generative-computing/mellea-contribs)** — community library for shared components -Questions? Join our [Discord](https://ibm.biz/mellea-discord). +Questions? Open a [GitHub Discussion](https://github.com/generative-computing/mellea/discussions). 
### IBM ❤️ Open Source AI From d1167a3d016e95c86ca725b7fa01faf741d9a827 Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 11:31:23 +0000 Subject: [PATCH 05/13] docs: remove Discord badge --- README.md | 1 - 1 file changed, 1 deletion(-) diff --git a/README.md b/README.md index cf26811ab..c856984c1 100644 --- a/README.md +++ b/README.md @@ -17,7 +17,6 @@ with structured, testable AI workflows built around type-annotated outputs, veri [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/pre-commit/pre-commit) [![GitHub License](https://img.shields.io/github/license/generative-computing/mellea)](https://github.com/generative-computing/mellea/blob/main/LICENSE) [![Contributor Covenant](https://img.shields.io/badge/Contributor%20Covenant-3.0-4baaaa.svg)](CODE_OF_CONDUCT.md) -[![Discord](https://img.shields.io/discord/1448407063813165219?logo=discord&logoColor=white&label=Discord&color=7289DA)](https://ibm.biz/mellea-discord) ## Install From 485d53f16b274bb5ae711eecd66159e7bba79a12 Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 11:34:51 +0000 Subject: [PATCH 06/13] docs: use GitHub Discussions, fix table header --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index c856984c1..4fb7ef9ce 100644 --- a/README.md +++ b/README.md @@ -51,7 +51,7 @@ print(user.age) # 31 — always an int, guaranteed by the schema ## Learn More -| Resource | | +| Resource | Description | |---|---| | [mellea.ai](https://mellea.ai) | Vision, features, and live demos | | [docs.mellea.ai](https://docs.mellea.ai) | Full docs — tutorials, API reference, how-to guides | @@ -66,7 +66,7 @@ We welcome contributions of all kinds — bug fixes, new backends, standard libr - **[Building Extensions](https://docs.mellea.ai/community/building-extensions)** — create reusable components in your own repo - 
**[mellea-contribs](https://github.com/generative-computing/mellea-contribs)** — community library for shared components -Questions? Open a [GitHub Discussion](https://github.com/generative-computing/mellea/discussions). +Questions? See [GitHub Discussions](https://github.com/generative-computing/mellea/discussions). ### IBM ❤️ Open Source AI From 214ef8c83145f3b5b2e65a84c5144cbc72bf9531 Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 11:35:06 +0000 Subject: [PATCH 07/13] docs: fix landing page description --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 4fb7ef9ce..c0324c3e1 100644 --- a/README.md +++ b/README.md @@ -53,7 +53,7 @@ print(user.age) # 31 — always an int, guaranteed by the schema | Resource | Description | |---|---| -| [mellea.ai](https://mellea.ai) | Vision, features, and live demos | +| [mellea.ai](https://mellea.ai) | Vision and features | | [docs.mellea.ai](https://docs.mellea.ai) | Full docs — tutorials, API reference, how-to guides | | [Colab notebooks](docs/examples/notebooks/) | Interactive examples you can run immediately | | [Code examples](docs/examples/) | Runnable examples: RAG, agents, Instruct-Validate-Repair (IVR), MObjects, and more | From 62e44370dd8ca09b448861c0bbad0126fd556b0e Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 11:42:32 +0000 Subject: [PATCH 08/13] docs: add capabilities section, fix table style --- README.md | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index c0324c3e1..787098c35 100644 --- a/README.md +++ b/README.md @@ -49,10 +49,19 @@ print(user.name) # Alice print(user.age) # 31 — always an int, guaranteed by the schema ``` +## What Mellea Does + +- **Structured output** — `@generative` turns typed functions into LLM calls; Pydantic schemas are enforced at generation time +- **Requirements & repair** — attach natural-language requirements to any call; Mellea validates 
and retries automatically +- **Sampling strategies** — rejection sampling, majority voting, inference-time scaling with one parameter change +- **Multiple backends** — Ollama, OpenAI, vLLM, HuggingFace, WatsonX, LiteLLM, Bedrock +- **Legacy integration** — drop Mellea into existing codebases with `mify` +- **MCP compatible** — expose any generative program as an MCP tool + ## Learn More | Resource | Description | -|---|---| +| --- | --- | | [mellea.ai](https://mellea.ai) | Vision and features | | [docs.mellea.ai](https://docs.mellea.ai) | Full docs — tutorials, API reference, how-to guides | | [Colab notebooks](docs/examples/notebooks/) | Interactive examples you can run immediately | From b4a3588ad1809dae2ac63c721dbacd0598f5cf03 Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 12:43:45 +0000 Subject: [PATCH 09/13] Update README.md Co-authored-by: Paul Schweigert --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 787098c35..95be4943a 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ # Mellea — build predictable AI without guesswork -Inside every AI-powered pipeline, the unreliable part is the same: the LLM call itself. +Inside every AI-powered pipeline, the unreliable part is the same: the LLM calls itself. Silent failures, untestable outputs, no guarantees. Mellea is a Python library for writing *generative programs* — replacing brittle prompts and flaky agents with structured, testable AI workflows built around type-annotated outputs, verifiable requirements, and automatic retries. 
From 99680fe21cf3ae45f00735efefcbe324c1c24e1a Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 12:46:41 +0000 Subject: [PATCH 10/13] Update README.md Co-authored-by: Paul Schweigert --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 95be4943a..5074d8058 100644 --- a/README.md +++ b/README.md @@ -55,7 +55,7 @@ print(user.age) # 31 — always an int, guaranteed by the schema - **Requirements & repair** — attach natural-language requirements to any call; Mellea validates and retries automatically - **Sampling strategies** — rejection sampling, majority voting, inference-time scaling with one parameter change - **Multiple backends** — Ollama, OpenAI, vLLM, HuggingFace, WatsonX, LiteLLM, Bedrock -- **Legacy integration** — drop Mellea into existing codebases with `mify` +- **Legacy integration** — easily drop Mellea into existing codebases with `mify` - **MCP compatible** — expose any generative program as an MCP tool ## Learn More From 15afd1c9fa94de94c85287e09355b20fb1442201 Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 13:03:23 +0000 Subject: [PATCH 11/13] docs: fix grammar, clarify sampling strategies description --- README.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 5074d8058..6c8cfea5f 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ # Mellea — build predictable AI without guesswork -Inside every AI-powered pipeline, the unreliable part is the same: the LLM calls itself. +Inside every AI-powered pipeline, the unreliable part is the same: the LLM call itself. Silent failures, untestable outputs, no guarantees. Mellea is a Python library for writing *generative programs* — replacing brittle prompts and flaky agents with structured, testable AI workflows built around type-annotated outputs, verifiable requirements, and automatic retries. 
@@ -53,7 +53,7 @@ print(user.age) # 31 — always an int, guaranteed by the schema - **Structured output** — `@generative` turns typed functions into LLM calls; Pydantic schemas are enforced at generation time - **Requirements & repair** — attach natural-language requirements to any call; Mellea validates and retries automatically -- **Sampling strategies** — rejection sampling, majority voting, inference-time scaling with one parameter change +- **Sampling strategies** — run a generation multiple times and pick the best result; swap between rejection sampling, majority voting, and more with one parameter change - **Multiple backends** — Ollama, OpenAI, vLLM, HuggingFace, WatsonX, LiteLLM, Bedrock - **Legacy integration** — easily drop Mellea into existing codebases with `mify` - **MCP compatible** — expose any generative program as an MCP tool From 62016fb6eafddc1f133864e711f9056c0bd5cb21 Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 14:26:45 +0000 Subject: [PATCH 12/13] ci: gate docstring quality and coverage in CI (#616) Add a hard-fail docstring quality gate to the docs-publish workflow: - New 'Docstring quality gate' step runs --quality --fail-on-quality --threshold 100; fails if any quality issue is found or coverage drops below 100% (both currently pass in CI) - Existing audit_coverage step (soft-fail, threshold 80) retained for the summary coverage metric Add typeddict_mismatch checks to audit_coverage.py: - typeddict_phantom: Attributes: documents a field not declared in the TypedDict - typeddict_undocumented: declared field absent from Attributes: section - Mirrors the existing param_mismatch logic for functions Pre-commit: enable --fail-on-quality on the manual-stage hook (CI is the hard gate; hook remains stages: [manual] as docs must be pre-built). Update CONTRIBUTING.md and docs/docs/guide/CONTRIBUTING.md with TypedDict docstring requirements and the two new audit check kinds. 
--- .github/workflows/docs-publish.yml | 42 ++++++++----------- .pre-commit-config.yaml | 11 +++-- CONTRIBUTING.md | 21 ++++++++++ docs/docs/guide/CONTRIBUTING.md | 5 +++ tooling/docs-autogen/audit_coverage.py | 56 ++++++++++++++++++++++++-- 5 files changed, 100 insertions(+), 35 deletions(-) diff --git a/.github/workflows/docs-publish.yml b/.github/workflows/docs-publish.yml index 77ad0f4a5..dc7494863 100644 --- a/.github/workflows/docs-publish.yml +++ b/.github/workflows/docs-publish.yml @@ -105,10 +105,17 @@ jobs: id: audit_coverage run: | set -o pipefail - uv run python tooling/docs-autogen/audit_coverage.py --docs-dir docs/docs/api --threshold 80 --quality 2>&1 \ + uv run python tooling/docs-autogen/audit_coverage.py --docs-dir docs/docs/api --threshold 80 2>&1 \ | tee /tmp/audit_coverage.log continue-on-error: ${{ inputs.strict_validation != true }} + - name: Docstring quality gate + id: quality_gate + run: | + set -o pipefail + uv run python tooling/docs-autogen/audit_coverage.py --docs-dir docs/docs/api --quality --fail-on-quality --threshold 100 2>&1 \ + | tee /tmp/quality_gate.log + # -- Upload artifact for deploy job -------------------------------------- - name: Upload docs artifact @@ -141,12 +148,14 @@ jobs: markdownlint_outcome = "${{ steps.markdownlint.outcome }}" validate_outcome = "${{ steps.validate_mdx.outcome }}" coverage_outcome = "${{ steps.audit_coverage.outcome }}" + quality_gate_outcome = "${{ steps.quality_gate.outcome }}" strict = "${{ inputs.strict_validation }}" == "true" mode = "" if strict else " *(soft-fail)*" lint_log = read_log("/tmp/markdownlint.log") validate_log = read_log("/tmp/validate_mdx.log") coverage_log = read_log("/tmp/audit_coverage.log") + quality_gate_log = read_log("/tmp/quality_gate.log") # Count markdownlint issues (lines matching file:line:col format) lint_issues = len([l for l in lint_log.splitlines() if re.match(r'.+:\d+:\d+ ', l)]) @@ -186,27 +195,11 @@ jobs: mdx_detail = parse_validate_detail(validate_log) - 
# Docstring quality annotation emitted by audit_coverage.py into the log + # Parse docstring quality annotation from quality gate log # Format: ::notice title=Docstring quality::message - # or ::warning title=Docstring quality::message - quality_match = re.search(r"::(notice|warning|error) title=Docstring quality::(.+)", coverage_log) - if quality_match: - quality_level, quality_msg = quality_match.group(1), quality_match.group(2) - quality_icon = "✅" if quality_level == "notice" else "⚠️" - quality_status = "pass" if quality_level == "notice" else "warning" - quality_detail = re.sub(r"\s*—\s*see job summary.*$", "", quality_msg) - quality_row = f"| Docstring Quality | {quality_icon} {quality_status}{mode} | {quality_detail} |" - else: - quality_row = None - - # Split coverage log at quality section to avoid duplicate output in collapsibles - quality_start = coverage_log.find("🔬 Running docstring quality") - if quality_start != -1: - quality_log = coverage_log[quality_start:] - coverage_display_log = coverage_log[:quality_start].strip() - else: - quality_log = "" - coverage_display_log = coverage_log + # or ::error title=Docstring quality::message + quality_gate_match = re.search(r"::(notice|warning|error) title=Docstring quality::(.+)", quality_gate_log) + quality_gate_detail = re.sub(r"\s*—\s*see job summary.*$", "", quality_gate_match.group(2)) if quality_gate_match else "" lines = [ "## Docs Build — Validation Summary\n", @@ -215,16 +208,15 @@ jobs: f"| Markdownlint | {icon(markdownlint_outcome)} {markdownlint_outcome}{mode} | {lint_detail} |", f"| MDX Validation | {icon(validate_outcome)} {validate_outcome}{mode} | {mdx_detail} |", f"| API Coverage | {icon(coverage_outcome)} {coverage_outcome}{mode} | {cov_detail} |", + f"| Docstring Quality | {icon(quality_gate_outcome)} {quality_gate_outcome} | {quality_gate_detail} |", ] - if quality_row: - lines.append(quality_row) lines.append("") for title, log, limit in [ ("Markdownlint output", lint_log, 5_000), ("MDX 
validation output", validate_log, 5_000), - ("API coverage output", coverage_display_log, 5_000), - ("Docstring quality details", quality_log, 1_000_000), + ("API coverage output", coverage_log, 5_000), + ("Docstring quality details", quality_gate_log, 1_000_000), ]: if log: lines += [ diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 302fe676a..513ff68bb 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -51,13 +51,12 @@ repos: language: system pass_filenames: false files: (docs/docs/.*\.mdx$|tooling/docs-autogen/) - # TODO(#616): Move to normal commit flow once docstring quality issues reach 0. - # Griffe loads the full package (~10s), so this is manual-only for now to avoid - # slowing down every Python commit. Re-enable (remove stages: [manual]) and add - # --fail-on-quality once quality issues are resolved. + # Docstring quality gate — manual only (CI is the hard gate via docs-publish.yml). + # Run locally with: pre-commit run docs-docstring-quality --hook-stage manual + # Requires generated API docs (run `uv run python tooling/docs-autogen/build.py` first). - id: docs-docstring-quality - name: Audit docstring quality (informational) - entry: bash -c 'test -d docs/docs/api && uv run --no-sync python tooling/docs-autogen/audit_coverage.py --quality --docs-dir docs/docs/api || true' + name: Audit docstring quality + entry: uv run --no-sync python tooling/docs-autogen/audit_coverage.py --quality --fail-on-quality --threshold 0 --docs-dir docs/docs/api language: system pass_filenames: false files: (mellea/.*\.py$|cli/.*\.py$) diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index fc2b12b09..0e0cb9918 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -174,6 +174,25 @@ differs in type or behaviour from the constructor input — for example, when a argument is wrapped into a `CBlock`, or when a class-level constant is relevant to callers. Pure-echo entries that repeat `Args:` verbatim should be omitted. 
+**`TypedDict` classes are a special case.** Their fields *are* the entire public +contract, so when an `Attributes:` section is present it must exactly match the +declared fields. The audit will flag: + +- `typeddict_phantom` — `Attributes:` documents a field that is not declared in the `TypedDict` +- `typeddict_undocumented` — a declared field is absent from the `Attributes:` section + +```python +class ConstraintResult(TypedDict): + """Result of a constraint check. + + Attributes: + passed: Whether the constraint was satisfied. + reason: Human-readable explanation. + """ + passed: bool + reason: str +``` + #### Validating docstrings Run the coverage and quality audit to check your changes before committing: @@ -194,6 +213,8 @@ Key checks the audit enforces: | `no_args` | Standalone function has params but no `Args:` section | | `no_returns` | Function has a non-trivial return annotation but no `Returns:` section | | `param_mismatch` | `Args:` documents names not present in the actual signature | +| `typeddict_phantom` | `TypedDict` `Attributes:` documents a field not declared in the class | +| `typeddict_undocumented` | `TypedDict` has a declared field absent from its `Attributes:` section | **IDE hover verification** — open any of these existing classes in VS Code and hover over the class name or a constructor call to confirm the hover card shows `Args:` once diff --git a/docs/docs/guide/CONTRIBUTING.md b/docs/docs/guide/CONTRIBUTING.md index fd0a58434..e92e4a5e4 100644 --- a/docs/docs/guide/CONTRIBUTING.md +++ b/docs/docs/guide/CONTRIBUTING.md @@ -353,6 +353,11 @@ Add `Attributes:` only when a stored value differs in type or behaviour from the input (e.g. a `str` wrapped into a `CBlock`, or a class-level constant). Pure-echo entries that repeat `Args:` verbatim should be omitted. +**`TypedDict` classes** are a special case — their fields are the entire public contract, +so when an `Attributes:` section is present it must exactly match the declared fields. 
+The CI audit will fail on phantom fields (documented but not declared) and undocumented +fields (declared but missing from `Attributes:`). + See [CONTRIBUTING.md](../../CONTRIBUTING.md) for the full validation workflow. --- diff --git a/tooling/docs-autogen/audit_coverage.py b/tooling/docs-autogen/audit_coverage.py index eb9506c9f..353ae3288 100755 --- a/tooling/docs-autogen/audit_coverage.py +++ b/tooling/docs-autogen/audit_coverage.py @@ -102,6 +102,7 @@ def walk_module(module, module_path: str): # --------------------------------------------------------------------------- _ARGS_RE = re.compile(r"^\s*(Args|Arguments|Parameters)\s*:", re.MULTILINE) +_TYPEDDICT_BASES = re.compile(r"\bTypedDict\b") _RETURNS_RE = re.compile(r"^\s*Returns\s*:", re.MULTILINE) _YIELDS_RE = re.compile(r"^\s*Yields\s*:", re.MULTILINE) _RAISES_RE = re.compile(r"^\s*Raises\s*:", re.MULTILINE) @@ -274,6 +275,45 @@ def _check_member(member, full_path: str, short_threshold: int) -> list[dict]: } ) + # TypedDict field mismatch check. + # Unlike regular classes (where Attributes: is optional under Option C), + # TypedDict fields *are* the entire public contract. When an Attributes: + # section exists, every entry must match an actual declared field and every + # declared field must appear — stale or missing entries are always a bug. 
+ is_typeddict = any( + _TYPEDDICT_BASES.search(str(base)) + for base in getattr(member, "bases", []) + ) + if is_typeddict and _ATTRIBUTES_RE.search(doc_text): + attrs_block = re.search( + r"Attributes\s*:(.*?)(?:\n\s*\n|\Z)", doc_text, re.DOTALL + ) + if attrs_block: + doc_field_names = set(_ARGS_ENTRY_RE.findall(attrs_block.group(1))) + actual_fields = { + name + for name, m in member.members.items() + if not name.startswith("_") and getattr(m, "is_attribute", False) + } + phantom = doc_field_names - actual_fields + if phantom: + issues.append( + { + "path": full_path, + "kind": "typeddict_phantom", + "detail": f"Attributes: documents {sorted(phantom)} not declared in TypedDict", + } + ) + undocumented = actual_fields - doc_field_names + if undocumented: + issues.append( + { + "path": full_path, + "kind": "typeddict_undocumented", + "detail": f"TypedDict fields {sorted(undocumented)} missing from Attributes: section", + } + ) + return issues @@ -296,11 +336,15 @@ def audit_docstring_quality( - no_class_args: class whose __init__ has typed params but no Args section on the class - duplicate_init_args: Args: present in both class docstring and __init__ (Option C violation) - param_mismatch: Args section documents names absent from the real signature + - typeddict_phantom: TypedDict Attributes: section documents fields not declared in the class + - typeddict_undocumented: TypedDict has declared fields absent from its Attributes: section - Note: Attributes: sections are intentionally not enforced. Under the Option C - convention, Attributes: is only used when stored values differ in type or - behaviour from the constructor inputs (e.g. type transforms, computed values, - class constants). Pure-echo entries that repeat Args: verbatim are omitted. + Note: Attributes: sections are intentionally not enforced for regular classes. Under + the Option C convention, Attributes: is only used when stored values differ in type or + behaviour from the constructor inputs (e.g. 
type transforms, computed values, class + constants). Pure-echo entries that repeat Args: verbatim are omitted. TypedDicts are + a carve-out: their fields are the entire public contract, so when an Attributes: + section is present it must exactly match the declared fields. Only symbols (and methods whose parent class) present in `documented` are checked when that set is provided — ensuring the audit is scoped to what is @@ -401,6 +445,8 @@ def _print_quality_report(issues: list[dict]) -> None: "no_class_args": "Missing class Args section", "duplicate_init_args": "Duplicate Args: in class + __init__ (Option C violation)", "param_mismatch": "Param name mismatches (documented but not in signature)", + "typeddict_phantom": "TypedDict phantom fields (documented but not declared)", + "typeddict_undocumented": "TypedDict undocumented fields (declared but missing from Attributes:)", } total = len(issues) @@ -419,6 +465,8 @@ def _print_quality_report(issues: list[dict]) -> None: "no_class_args", "duplicate_init_args", "param_mismatch", + "typeddict_phantom", + "typeddict_undocumented", ): items = by_kind.get(kind, []) if not items: From d777cfc6a29de1a4a36b79db5444b321e0b29a7b Mon Sep 17 00:00:00 2001 From: Nigel Jones Date: Wed, 18 Mar 2026 18:05:27 +0000 Subject: [PATCH 13/13] fix: always populate mot.usage in HuggingFace backend (#694) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Token count extraction in _post_process_async was gated behind `span is not None or metrics_enabled`, so mot.usage was never populated in plain (non-telemetry) runs. Now extracted unconditionally — usage is a standard mot field, not a telemetry concern. 
--- mellea/backends/huggingface.py | 12 ++---------- 1 file changed, 2 insertions(+), 10 deletions(-) diff --git a/mellea/backends/huggingface.py b/mellea/backends/huggingface.py index 424d5b2f3..e6236e5c6 100644 --- a/mellea/backends/huggingface.py +++ b/mellea/backends/huggingface.py @@ -1133,18 +1133,11 @@ class used during generation, if any. ) span = mot._meta.get("_telemetry_span") - from ..telemetry.metrics import is_metrics_enabled - metrics_enabled = is_metrics_enabled() - - # Extract token counts only if needed + # Derive token counts from the output sequences (HF models have no usage object). hf_output = mot._meta.get("hf_output") n_prompt, n_completion = None, None - if (span is not None or metrics_enabled) and isinstance( - hf_output, GenerateDecoderOnlyOutput - ): - # HuggingFace local models don't provide usage objects, but we can - # calculate token counts from sequences + if isinstance(hf_output, GenerateDecoderOnlyOutput): try: if input_ids is not None and hf_output.sequences is not None: n_prompt = input_ids.shape[1] @@ -1152,7 +1145,6 @@ class used during generation, if any. except Exception: pass - # Populate standardized usage field (convert to OpenAI format) if n_prompt is not None and n_completion is not None: mot.usage = { "prompt_tokens": n_prompt,
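The pattern the final patch settles on — derive prompt/completion token counts from output shapes unconditionally, then populate an OpenAI-style `usage` dict — can be sketched in isolation. This is a minimal stand-in, not Mellea's backend code: the `derive_usage` helper and the plain-list "tensors" are illustrative substitutes for `transformers.GenerateDecoderOnlyOutput` and torch tensors, whose sequences hold prompt plus completion tokens per sample.

```python
def derive_usage(prompt_len, sequences):
    """Return an OpenAI-style usage dict, or None if counts can't be derived.

    Each row of `sequences` contains the prompt tokens followed by the
    generated tokens, so completion length is total length minus prompt length.
    Mirrors the patch's try/except: any shape surprise yields no usage rather
    than an error.
    """
    try:
        n_prompt = prompt_len
        n_completion = len(sequences[0]) - prompt_len
    except Exception:
        return None
    if n_completion < 0:
        return None
    return {
        "prompt_tokens": n_prompt,
        "completion_tokens": n_completion,
        "total_tokens": n_prompt + n_completion,
    }


# 3 prompt tokens followed by 4 generated tokens in a single output sequence.
usage = derive_usage(3, [[1, 2, 3, 10, 11, 12, 13]])
# usage == {"prompt_tokens": 3, "completion_tokens": 4, "total_tokens": 7}
```

Computing this unconditionally, as the patch does, keeps `usage` a property of every generation rather than a side effect of telemetry being enabled.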