Merged
31 changes: 25 additions & 6 deletions .github/copilot-instructions.md
@@ -68,6 +68,11 @@ make docs # Generates Mermaid diagrams from YAML schemas

## C Coding Conventions

### Comments

- Only comment when necessary to explain non-obvious code or rationale.
- Keep comments terse and on point.

### Generated union constructor functions

- `new<Union>_<Variant>(parserInfo, variant)` - Wraps an existing variant in a union
@@ -88,6 +93,13 @@ make docs # Generates Mermaid diagrams from YAML schemas

- Always use explicit NULL comparisons: `if (ptr != NULL)` or `if (ptr == NULL)`
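A minimal sketch of the explicit-comparison style, assuming a hypothetical `Node` list type and helper names (`listLength`, `lastNode`, `firstValue`) that are illustrative only and not taken from the codebase:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node, used only to illustrate the comparison style. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Counts nodes, spelling out the NULL comparison in the loop condition. */
size_t listLength(const Node *head) {
    size_t length = 0;
    while (head != NULL) { /* rather than: while (head) */
        length++;
        head = head->next;
    }
    return length;
}

/* Returns the last node, or NULL for an empty list. */
const Node *lastNode(const Node *head) {
    if (head == NULL) { /* rather than: if (!head) */
        return NULL;
    }
    while (head->next != NULL) {
        head = head->next;
    }
    return head;
}

/* Returns the first value; the precondition is also written explicitly. */
int firstValue(const Node *head) {
    assert(head != NULL);
    return head->value;
}
```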

### Boolean values

- Use `bool` for variables and return values that represent truth values.
- Do not use `int` as a boolean substitute when the API returns boolean semantics.
- Prefer direct boolean checks (e.g., `if (flag)`) and assertions (e.g., `assert(flag)`).
- Example: `bool matches = eqTerm(actual, expected); assert(matches);`
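The boolean guidance above can be sketched as follows. The `Term` struct and this `eqTerm` signature are hypothetical stand-ins for illustration; the real definitions in the project may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a project term type; not the real definition. */
typedef struct Term {
    int value;
} Term;

/* Hypothetical equality check: returns bool, not int. */
bool eqTerm(const Term *a, const Term *b) {
    if (a == NULL || b == NULL) {
        return a == b; /* equal only when both are NULL */
    }
    return a->value == b->value;
}

/* Counts how many candidates equal the expected term. */
size_t countMatches(const Term *candidates, size_t n, const Term *expected) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        bool matches = eqTerm(&candidates[i], expected);
        if (matches) { /* direct boolean check, no "== 1" or "!= 0" */
            count++;
        }
    }
    return count;
}
```

Pointers keep their explicit NULL comparisons, while the `bool` result is tested directly.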

### Naming Conventions

- Types: `MixedCase` (e.g., `LamExp`, `AstExpression`)
@@ -96,12 +108,13 @@ make docs # Generates Mermaid diagrams from YAML schemas

## Documentation Style

- Use simple periods instead of exclamation points
- Avoid hyperbole: use "significant", "notable" instead of "HUGE", "Amazing"
- Avoid emphatic modifiers in headings and verdict-style declarations
- No emoji
- Minimize bold emphasis on routine statements
- Follow markdownlint rules
- Use simple periods instead of exclamation points.
- Avoid hyperbole: use "significant", "notable" instead of "HUGE", "Amazing".
- Avoid emphatic modifiers in headings and verdict-style declarations.
- No emoji.
- Minimize bold emphasis on routine statements.
- Follow markdownlint rules.
- Prefer Mermaid for diagrams.

## Debugging

@@ -170,6 +183,12 @@ For detailed information on specific compiler stages, see:
- [anf.md](../docs/agent/anf.md) - A-Normal Form conversion
- [language-syntax.md](../docs/agent/language-syntax.md) - F♮ language reference

## Rewrite Prototyping

For guidance on the self-hosting/prototyping pipeline in `fn/rewrite`, including `test_harness.fn`, pass ordering, and `samples.fn` usage, see:

- [rewrite-self-hosting-guide.md](../docs/agents/rewrite-self-hosting-guide.md)

## When Reading Code

- Start at `src/main.c` for overall flow
16 changes: 12 additions & 4 deletions .github/workflows/makefile.yml
@@ -10,15 +10,23 @@ jobs:
build:

runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
compiler: [gcc, clang]

steps:
- uses: actions/checkout@v3

- name: Install sqlite
run: make install-sqlite3

- name: Build executable
run: make CCC=gcc MODE=testing
- name: Install clang
if: matrix.compiler == 'clang'
run: sudo apt-get update && sudo apt-get --yes install clang

- name: Run tests
run: make test CCC=gcc MODE=testing
- name: Build executable (${{ matrix.compiler }})
run: make CCC=${{ matrix.compiler }} MODE=testing

- name: Run tests (${{ matrix.compiler }})
run: make test CCC=${{ matrix.compiler }} MODE=testing
61 changes: 61 additions & 0 deletions docs/CPS_COMPLETE_GUIDE.md
@@ -185,6 +185,67 @@ When examining CPS output, verify:

---

## Optimization Staging Around CPS

The project already has working CPS transforms in [src/lambda_cpsTc.c](../src/lambda_cpsTc.c) and [src/lambda_cpsTk.c](../src/lambda_cpsTk.c).

A practical optimization schedule is:

1. Light simplification before CPS.
2. CPS transform.
3. Aggressive administrative reduction after CPS.

### Why split optimization this way

- Pre-CPS simplification reduces CPS output size and compile-time churn.
- Post-CPS simplification removes the large volume of administrative redexes introduced by CPS.
- In a strict language with effects and letrec, aggressive source-level beta/eta is riskier than CPS-level cleanup.

### Pre-CPS: keep it conservative

Before CPS, keep reductions safe and local:

- Beta only when argument shape is known-safe under call-by-value policy.
- Eta only when no effect or recursion-order hazard is introduced.
- Letrec-aware eta should not contract wrappers that reference letrec-bound symbols.

This keeps source semantics stable before control flow is made explicit.

### Post-CPS: do the heavy cleanup

After CPS, prioritize administrative reductions:

- Beta: contract immediate continuation wrappers and one-shot binders.
- Eta: remove continuation forwarding wrappers when they are pure forwarding.
- Dead continuation bindings: drop continuation lambdas that are never used.

Typical wins come from patterns like:

```fn
((λ (k) body) c)
```

and

```fn
(λ (x k) (f x k))
```

when no letrec-sensitive recursion or effect ordering is changed.

### Suggested safety checks for CPS-era eta

For a candidate contraction:

- Ensure forwarded arguments are exactly the lambda parameters.
- Ensure no duplicated evaluation is introduced.
- Ensure recursive group symbols are not crossed in a way that changes forcing/arity behavior.
- Ensure effectful primitives are not reordered.

This gives a robust default: small, safe pre-CPS simplification and high-leverage post-CPS normalization.
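
The eta checks listed above can be sketched in C as follows. This is a minimal illustration over a hypothetical toy `Exp` type, not the project's `MinExp` structures or its actual eta pass. It also assumes the forwarded function must be a plain variable, which is a conservative way to avoid moving or duplicating any evaluation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ARITY 4

typedef enum { EXP_VAR, EXP_LAM, EXP_APP } ExpKind;

/* Hypothetical toy IR: only the fields relevant to each kind are used. */
typedef struct Exp {
    ExpKind kind;
    const char *name;               /* EXP_VAR: variable name */
    const char *params[MAX_ARITY];  /* EXP_LAM: parameter names */
    const struct Exp *body;         /* EXP_LAM: body */
    const struct Exp *fn;           /* EXP_APP: function position */
    const struct Exp *args[MAX_ARITY]; /* EXP_APP: arguments */
    size_t count;                   /* number of params or args */
} Exp;

static bool nameIn(const char *name, const char *const *names, size_t n) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, names[i]) == 0) {
            return true;
        }
    }
    return false;
}

/*
 * If lam has the shape (λ (p1 ... pn) (f p1 ... pn)) and contraction looks
 * safe, returns f; otherwise returns NULL and the wrapper should be kept.
 */
const Exp *etaContract(const Exp *lam, const char *const *letrecNames,
                       size_t nLetrec) {
    if (lam == NULL || lam->kind != EXP_LAM) {
        return NULL;
    }
    assert(lam->count <= MAX_ARITY);
    const Exp *app = lam->body;
    if (app == NULL || app->kind != EXP_APP || app->count != lam->count) {
        return NULL;
    }
    const Exp *fn = app->fn;
    /* Only a bare variable: nothing effectful is evaluated or reordered. */
    if (fn == NULL || fn->kind != EXP_VAR) {
        return NULL;
    }
    /* f must not be one of the wrapper's own parameters. */
    if (nameIn(fn->name, lam->params, lam->count)) {
        return NULL;
    }
    /* Do not contract wrappers around letrec-bound symbols. */
    if (nameIn(fn->name, letrecNames, nLetrec)) {
        return NULL;
    }
    /* Forwarded arguments must be exactly the parameters, in order. */
    for (size_t i = 0; i < lam->count; i++) {
        const Exp *arg = app->args[i];
        if (arg == NULL || arg->kind != EXP_VAR
            || strcmp(arg->name, lam->params[i]) != 0) {
            return NULL;
        }
    }
    return fn;
}
```

With this sketch, `(λ (x k) (f x k))` contracts to `f`, while the same wrapper is kept when `f` is letrec-bound or when the arguments are forwarded in a different order.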

---

## Examples with Expected Output

### Example 1: Simple Application
1 change: 1 addition & 0 deletions docs/README.md
@@ -3,6 +3,7 @@
* [ANF](ANF.md) A-Normalization
* [Arithmetic](Arithmetic.md) Notes on rational and complex arithmetic.
* [CODEGEN](CODEGEN.md) Notes on the code generator utility.
* [CPS COMPLETE GUIDE](CPS_COMPLETE_GUIDE.md) CPS transformation notes and optimization staging.
* [ENV](ENV.md) Abandoned plan to have first-class environments.
* [LEXICAL ADDRESSING](LEXICAL_ADDRESSING.md) De-Bruijn indexing for fast variable look up.
* [MACROS](MACROS.md) Initial thoughts on a simple macro system.
2 changes: 0 additions & 2 deletions docs/TODO.md
@@ -3,8 +3,6 @@
More of a wish-list than a hard and fast plan.

* Simplify
* Add a beta-reduction pass (after cps-transform).
* Add an eta-reduction pass.
* Add a constant/operator folding pass after beta and eta reduction.
* Target LLVM
* `syntax` construct that allows large-scale syntactic structures to be defined by the user.
30 changes: 29 additions & 1 deletion docs/agent/code-generation.md
@@ -4,7 +4,7 @@ The build depends heavily on Python code generation. **Do not manually edit file

## Overview

The code generator is modular: the main entry point is `tools/generate.py`, which orchestrates the `generate` Python package (in `tools/generate/`). This package contains all logic for parsing YAML schemas and generating C code for all compiler stages. Contains modules for each type of generated structure (structs, discriminated unions, hashes, arrays etc.)
The code generator is modular: the main entry point is `tools/generate.py`, which orchestrates the `generate` Python package (in `tools/generate/`). This package contains all logic for parsing YAML schemas and generating C code for all compiler stages. It contains modules for each type of generated structure (structs, discriminated unions, hashes, arrays, etc.).

## YAML Schema Structure

@@ -70,6 +70,16 @@ hashes:

The YAML may also contain an `inline` section, which in turn can contain arrays, unions and structs. These inline variants are not separately memory managed (no GC header), are often passed by value, and may be used as components of structs without the extra pointer indirection.

The YAML may also contain an `external` section:

```yaml
external:
- !include tc.yaml
- !include utils.yaml
```

External includes are loaded into the same `Catalog`, so cross-stage type information is available during generation. These entries are flagged as external and are not emitted as local generated definitions.

## Primitives (`src/primitives.yaml`)

Common types shared across all stages - referenced via `!include primitives.yaml`:
@@ -139,13 +149,31 @@ For each struct/union, the code generator produces:
4. **Include headers** in your C code: `#include "<stage>.h"`
5. **Use generated functions** - no manual memory management code needed

## Manual Visitor Boilerplate

Visitor generation is primarily a manual scaffolding workflow for creating an initial C file that is then edited by hand.

Use:

```bash
python3 tools/generate.py src/<stage>.yaml visitor --target=<suffix> > generated/<stage>_<suffix>_visitor.c
```

Notes:

- `target` prefixes generated function names (example: `cpsTkLamExp`).
- The generated file includes `#include "<stage>_<suffix>.h"`; ensure that header exists before compiling.
- Visitor output includes only non-external entities from the current stage YAML.
- Regenerate only when re-scaffolding. Once manual edits begin, treat the visitor file as hand-maintained C code.

## Important Notes

- **ParserInfo**: If `parserInfo: true`, all structs get a `ParserInfo I` field that records the source file and line number for error reporting.
- **Auto-initialized fields**: Use `field: type=value` syntax in YAML to have the constructor initialize the field automatically rather than requiring it as a parameter.
- **GC Integration**: All generated `new*()` functions automatically register with GC
- **Type safety**: Generated code includes type checking in mark/free dispatchers
- **Documentation**: YAML `meta` blocks generate doxygen-style comments
- **External entries**: Types from `external:` are available for references and type resolution, but code generation only emits non-external entities for the current stage.

## Adding New Structures

17 changes: 17 additions & 0 deletions docs/agent/workflows.md
@@ -26,6 +26,23 @@ The compiler uses two distinct mechanisms for reporting errors depending on thei

Check `utils_helper.[ch]` before implementing functions over common types defined in `utils.yaml`. If you do need to implement something, consider adding it to `utils_helper.[ch]` if it could be useful elsewhere.

## Root Shell Helpers (`utils.sh`)

There is a project-level shell helper file at `./utils.sh`. Agents working from the command line should be aware of it, and may source it when useful:

```bash
source ./utils.sh
```

Notable quality-of-life helpers include:

* `new_h <name>`: creates `src/<name>.h` with include guards and GPL header text.
* `new_c <name>`: creates `src/<name>.c` with GPL header text.
* `new_ch <name>`: creates a matching `.c`/`.h` pair and adds `#include "<name>.h"` to the `.c` file.
* `new_visitor <stage> <suffix>`: creates `src/<stage>_<suffix>.h` and generates visitor boilerplate C from `src/<stage>.yaml`.

If an agent does not use these helpers directly, it should still follow the same creation patterns (especially GPL header insertion and header guard style) when creating new source files manually.

## Adding a Built-in Function

To add a new native function callable from F♮ code:
115 changes: 115 additions & 0 deletions docs/agents/rewrite-self-hosting-guide.md
@@ -0,0 +1,115 @@
# Rewrite self-hosting guide for agents

This document explains the `fn/rewrite` area so agents can reason about it as a prototype pipeline for compiler stages.

## Why this folder matters

`fn/rewrite` is not just experimental code. It is a staging area where algorithms are prototyped in the language itself before or alongside C implementations.

Recent and relevant examples:

- β-reduction is implemented in `fn/rewrite/beta_reduce.fn`.
- η-reduction is implemented in `fn/rewrite/eta_reduce.fn`.
- Constant/operator folding is implemented in `fn/rewrite/constant_folding.fn`.
- Closure conversion variants are in `fn/rewrite/closure-convert.fn`.

Treat this directory as a design/prototyping reference when working on equivalent C stages.

## Primary entrypoint

Start with `fn/rewrite/test_harness.fn`.

It loads:

- `samples.fn` input corpus.
- Front-end representation and parser from `expr.fn`.
- Lower-level representation from `minexpr.fn`.
- Transform passes: `desugar`, `curry`, `eta_reduce`, `cps`, `beta_reduce`, `constant_folding`, `closure-convert`.

### Execution

From repo root:

```bash
./bin/fn fn/rewrite/test_harness.fn
```

The harness prints each sample and a transformed result. By default, intermediate pretty-prints are commented out in the file, but can be re-enabled for debugging pass-by-pass output.

## Data flow in the harness

For each non-comment entry in `samples.fn`:

1. Parse source string with `E.parse` (`expr` IR).
2. `DS.desugar` lowers to `minexpr` and removes/normalizes syntax forms.
3. `C.curry` enforces unary application/lambda shape.
4. `η.reduce` performs eta reduction.
5. `CPS.T_c(..., M.var("□"))` runs CPS conversion with a hole continuation.
6. `β.reduce` contracts lambda applications and handles arity mismatch cases.
7. `OF.fold` applies algebraic and constant simplification.
8. `CC.shared_closure_convert` computes shared closure-converted output.

Note the current behavior: the harness computes step 8 but prints the step 7 result (`g`) by default.

## The sample corpus

`fn/rewrite/samples.fn` provides:

- General language constructs (`let`, `letrec`, lambdas, conditionals, `call/cc`, `amb`, constructors, vectors, namespaces, etc.).
- A large arithmetic/optimization section focused on constant-folding behavior.
- Several explicit η-reduction-focused samples near the end.
- Inline comments as strings beginning with `";"` that the harness prints as section labels.

Use this corpus first when validating behavioral changes in rewrite passes.

## Representation split to keep in mind

- `expr.fn` is intended to broadly mirror `src/lambda.yaml` (LamExp-like shapes in the rewrite prototype).
- `desugar.fn` lowers from `expr.fn` to `minexpr.fn`.
- `minexpr.fn` is intended to broadly mirror `src/minlam.yaml` (MinExp-like reduced forms).
- All subsequent transforms in the main rewrite pipeline operate on `minexpr.fn` values.

This mirrors the C compiler flow where desugaring lowers `LamExp` to `MinExp` before later optimization/normalization stages.

Many transform bugs come from accidentally assuming an `expr`/LamExp-like form still exists after desugaring, or from printing/parsing assumptions between these two representations.

## Key transform files and purpose

- `desugar.fn`: lowers `expr` to `minexpr`, rewrites `let*`, transforms some constructs to primitive forms, and handles partial/over-application shaping via arity context.
- `curry.fn`: rewrites lambdas and applications into curried form.
- `eta_reduce.fn`: removes `λx.(f x)`-style wrappers when safe (`occurs_in` guard).
- `cps.fn`: CPS transform (`T_k`, `T_c`, and list helper `Ts_k`).
- `beta_reduce.fn`: β-reduction with explicit handling for exact/under/over application.
- `constant_folding.fn`: algebraic simplifier plus recursive fold over `minexpr`.
- `closure-convert.fn`: closure conversion with `flat_closure_convert` (bottom-up) and `shared_closure_convert` (top-down) using `transform.fn` traversal helpers.

## Relationship to the C compiler pipeline

Use this rewrite pipeline as a conceptual mirror for C stages, not as an exact 1:1 implementation.

IR correspondence first:

- `fn/rewrite/expr.fn` ≈ `src/lambda.yaml`
- `fn/rewrite/minexpr.fn` ≈ `src/minlam.yaml`
- `fn/rewrite/desugar.fn` performs the same broad lowering role as `src/lambda_desugar.c` (`LamExp`-like to `MinExp`-like)

Useful rough correspondences:

- `cps.fn` ↔ CPS/continuation strategy used in `src/lambda_cpsTk.c` and `src/lambda_cpsTc.c`
- `beta_reduce.fn` / `eta_reduce.fn` ↔ min-lambda simplification ideas (`src/minlam_*.c`)
- `constant_folding.fn` ↔ arithmetic simplification ideas (see `src/arithmetic.c` and related optimization logic)
- `closure-convert.fn` ↔ closure conversion concepts in lambda conversion/runtime lowering

When adding or changing a C-stage algorithm, check whether `fn/rewrite` already has a compact version that clarifies intent or edge-case behavior.

## Agent workflow recommendations

When asked to modify a rewrite pass:

1. Start from `test_harness.fn` and identify where in the sequence the pass runs.
2. Add or adjust cases in `samples.fn` close to the affected feature area.
3. Toggle intermediate prints in `test_harness.fn` to isolate the first divergent stage.
4. Keep changes constrained to one IR layer (`expr` or `minexpr`) per edit when possible.
5. If a rewrite behavior is intended to inform C code, document the invariant in both places.

This keeps prototype and production-stage behavior aligned as self-hosting efforts expand.
20 changes: 20 additions & 0 deletions docs/generated/term.md
@@ -0,0 +1,20 @@
# term

Specific arithmetic structures for performing constant folding.

```mermaid
flowchart LR
TermOp --left--> Term
TermOp --right--> Term
TermValue --value--> Value
Term --add--> TermOp
Term --sub--> TermOp
Term --mul--> TermOp
Term --div--> TermOp
Term --mod--> TermOp
Term --pow--> TermOp
Term --num--> TermValue
Term --other--> MinExp
```

> Generated from src/term.yaml by tools/generate.py