Commit 4e104cb

mpecan and claude authored
feat(filter,cli,server): sandbox CLI Lua execution and inline scripts on publish (#194)
## Summary

- **Extract `tokf-filter` crate** with the filter engine and sandboxed Lua execution, shared between CLI and server
- **Sandbox all Lua execution**: remove unsandboxed `run_lua_script()`; all paths now use `run_lua_script_sandboxed()` with instruction-count (1M) and memory (16MB) limits
- **Unify `apply()`/`apply_sandboxed()`** into shared `apply_internal()`, eliminating ~96% code duplication
- **Auto-inline external Lua scripts on publish**: `tokf publish` reads `lua_script.file`, embeds as inline `source`, with path traversal protection via `canonicalize()` + `starts_with()`
- **Server-side test verification**: filters are verified against their test suites before persisting to storage (both publish and test update endpoints)
- **Fail-fast server validation**: rejects `lua_script.file` before hashing/upload with a helpful hint
- **Documentation updates**: new `publishing-filters.md`, updated Lua escape hatch docs with sandbox limits, updated writing-filters with `[lua_script]` reference
- **File size compliance**: split `publish.rs` and `update_tests.rs` into directory modules (tests extracted to sibling files)

## Test plan

- [x] All 1152 workspace tests pass (`cargo test --workspace`)
- [x] Clippy clean (`cargo clippy --workspace --all-targets -- -D warnings`)
- [x] File size limits pass (all files under 700 lines)
- [ ] DB integration tests (`just test-db`, requires CockroachDB)
- [ ] End-to-end tests (`just test-e2e`, requires CockroachDB)
- [ ] Manual test: `tokf publish` with `lua_script.file` auto-inlines correctly
- [ ] Manual test: path traversal (`lua_script.file = "../../etc/passwd"`) is rejected

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
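
The summary names `canonicalize()` + `starts_with()` as the path traversal protection. A minimal sketch of that pattern follows, using only the standard library. The function name `resolve_script_path` and the error message are hypothetical and are not taken from the commit; the actual implementation in `tokf` may differ.

```rust
use std::io;
use std::path::{Path, PathBuf};

/// Resolve `relative` against `filter_dir` and reject anything that escapes it.
/// Hypothetical helper: the name and error text are illustrative only.
pub fn resolve_script_path(filter_dir: &Path, relative: &str) -> io::Result<PathBuf> {
    // canonicalize() resolves `..` components and symlinks.
    // It returns an error if the path does not exist.
    let base = filter_dir.canonicalize()?;
    let candidate = base.join(relative).canonicalize()?;

    // Path::starts_with compares whole path components, so a sibling such as
    // `/x/filters-evil` is not treated as being inside `/x/filters`.
    if !candidate.starts_with(&base) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("script path escapes filter directory: {relative}"),
        ));
    }
    Ok(candidate)
}
```

Canonicalizing both sides before comparing is what makes the prefix check meaningful: a purely textual check on the un-resolved path would accept `sub/../../secret.txt`. An absolute `relative` value is also rejected, because `join` replaces the base and the result no longer starts with it.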
1 parent 737d522 commit 4e104cb

53 files changed

Lines changed: 2975 additions & 1574 deletions

.dupes-ignore.toml

Lines changed: 8 additions & 0 deletions
@@ -117,3 +117,11 @@ reason = "TestHarness::blocking_update_tests/blocking_search_filters - spawn_blo
 [[ignore]]
 fingerprint = "9fc57c07628e198e"
 reason = "TestHarness::blocking_gain/blocking_list_machines - spawn_blocking test boilerplate; test clarity > DRY"
+
+[[ignore]]
+fingerprint = "ccbacfce8575cd81"
+reason = "publish_filter_rejects_missing_* tests - same post+assert+check-error pattern with different payloads and error messages; test clarity > DRY"
+
+[[ignore]]
+fingerprint = "91e815e8a2c7146b"
+reason = "publish_filter_rejects_oversized/failing tests - same post+assert+check-error pattern with different inputs and assertions; test clarity > DRY"

.github/workflows/deploy-server.yml

Lines changed: 1 addition & 0 deletions
@@ -6,6 +6,7 @@ on:
 paths:
 - "crates/tokf-server/**"
 - "crates/tokf-common/**"
+- "crates/tokf-filter/**"
 - "Cargo.toml"
 - "Cargo.lock"
 - "Dockerfile"

CONTRIBUTING.md

Lines changed: 9 additions & 2 deletions
@@ -108,9 +108,16 @@ Every filter in the stdlib **must** have a `_test/` suite — CI enforces this w
 
 ## Lua filters
 
-For filters that need logic beyond what TOML can express, use the `[lua_script]` section with [Luau](https://luau.org/). The sandbox blocks `io`, `os`, and `package` — scripts cannot access the filesystem or network.
+For filters that need logic beyond what TOML can express, use the `[lua_script]` section with [Luau](https://luau.org/).
 
-See the [README](README.md#lua-escape-hatch) for the full API and the built-in filter library for examples.
+All Lua execution is sandboxed:
+
+- **Blocked libraries:** `io`, `os`, `package` — no filesystem or network access.
+- **Resource limits:** 1 million VM instructions, 16 MB memory (prevents infinite loops and memory exhaustion).
+
+For local development, you can reference external scripts with `lua_script.file = "script.luau"`. For published filters, use inline `source`; `tokf publish` automatically inlines file references before uploading.
+
+See `docs/lua-escape-hatch.md` for the full API, globals, and examples.
 
 ---
 

Cargo.lock

Lines changed: 14 additions & 1 deletion
Generated file; diff not rendered.

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 [workspace]
-members = ["crates/tokf-cli", "crates/tokf-common", "crates/tokf-server", "crates/crdb-test-macro", "crates/e2e-tests"]
+members = ["crates/tokf-cli", "crates/tokf-common", "crates/tokf-filter", "crates/tokf-server", "crates/crdb-test-macro", "crates/e2e-tests"]
 resolver = "2"
 
 [workspace.package]

Dockerfile

Lines changed: 2 additions & 0 deletions
@@ -1,6 +1,8 @@
 # syntax=docker/dockerfile:1
 # ── Stage 1: build ──────────────────────────────────────────────────────────
 FROM rust:slim AS builder
+# g++ is required for mlua's vendored Luau build (C++ source compiled via cc crate)
+RUN apt-get update && apt-get install -y --no-install-recommends g++ && rm -rf /var/lib/apt/lists/*
 WORKDIR /app
 COPY Cargo.toml Cargo.lock ./
 COPY crates/ crates/

README.md

Lines changed: 76 additions & 2 deletions
@@ -298,6 +298,12 @@ collapse_empty_lines = true # collapse consecutive blank lines into one
 
 show_history_hint = true # append a hint line pointing to the full output in history
 
+# Lua escape hatch — for logic TOML can't express (see Lua Escape Hatch section)
+[lua_script]
+lang = "luau"
+source = 'return output:upper()' # inline script
+# file = "transform.luau" # or reference a local file (auto-inlined on publish)
+
 match_output = [ # whole-output substring checks, short-circuit the pipeline
 { contains = "rejected", output = "push rejected" },
 ]
@@ -453,7 +459,28 @@ end
 
 Available globals: `output` (string), `exit_code` (integer — the underlying command's real exit code, unaffected by `--no-mask-exit-code`), `args` (table).
 Return a string to replace output, or `nil` to fall through to the rest of the TOML pipeline.
-The sandbox blocks `io`, `os`, and `package` — no filesystem or network access from scripts.
+
+### Sandbox
+
+All Lua execution is sandboxed — both in the CLI and on the server:
+
+- **Blocked libraries:** `io`, `os`, `package` — no filesystem or network access.
+- **Instruction limit:** 1 million VM instructions (prevents infinite loops).
+- **Memory limit:** 16 MB (prevents memory exhaustion).
+
+Scripts that exceed these limits are terminated and treated as a passthrough (the TOML pipeline continues as if no Lua script was configured).
+
+### External script files
+
+For local development you can keep the script in a separate `.luau` file:
+
+```toml
+[lua_script]
+lang = "luau"
+file = "transform.luau"
+```
+
+Only one of `file` or `source` may be set — not both. When you run `tokf publish`, file references are automatically inlined (the file content is embedded as `source`) so the published filter is self-contained. The script file must reside within the filter's directory — path traversal (e.g. `../secret.txt`) is rejected.
 
 ---
 
@@ -887,9 +914,56 @@ tokf publish --update-tests git/push --dry-run # preview only
 
 ---
 
+---
+
+
 ## Publishing a Filter
 
-See [Publishing Filters](./publishing-filters.md) for how to share your own filters.
+```sh
+tokf publish <filter-name>
+```
+
+Publishes a local filter to the community registry under the MIT license. Authentication is required — run `tokf auth login` first.
+
+### Requirements
+
+- The filter must be a **user-level or project-local** filter (not a built-in). Use `tokf eject` first if needed.
+- At least one **test file** must exist in the adjacent `_test/` directory. The server runs these tests against your filter before accepting the upload.
+- You must accept the **MIT license** (prompted on first publish, remembered afterwards).
+
+### What happens on publish
+
+1. The filter TOML is read and validated.
+2. If the filter uses `lua_script.file`, the referenced script is **automatically inlined** — its content is embedded as `lua_script.source` so the published filter is self-contained. The script file must reside within the filter's directory (path traversal is rejected).
+3. A content hash is computed from the parsed config. This hash is the filter's permanent identity.
+4. The filter and test files are uploaded. The server verifies tests pass before accepting.
+5. On success, the registry URL is printed.
+
+### Options
+
+| Flag | Description |
+|------|-------------|
+| `--dry-run` | Preview what would be published without uploading |
+| `--update-tests` | Replace the test suite for an already-published filter |
+
+### Examples
+
+```sh
+tokf publish git/push # publish a filter
+tokf publish git/push --dry-run # preview only
+tokf publish --update-tests git/push # replace test suite
+```
+
+### Size limits
+
+- Filter TOML: 64 KB max
+- Total upload (filter + tests): 1 MB max
+
+### Lua scripts in published filters
+
+Published filters must use **inline `source`** for Lua scripts — `lua_script.file` is not supported on the server. The `tokf publish` command handles this automatically by reading the file and embedding its content. You don't need to change your filter.
+
+All Lua scripts in published filters are executed in a sandbox with resource limits (1 million instructions, 16 MB memory) during server-side test verification.
 
 ---
 

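The README changes above describe the auto-inlining step: a `file` reference is replaced by inline `source`, and setting both is an error. A minimal sketch of that transformation is shown here. The `LuaScript` struct, the `inline_script` function, and the injected `read` callback are all hypothetical names chosen for illustration; they are assumptions, not the types used in `tokf`.

```rust
/// Hypothetical mirror of the `[lua_script]` table; the real type may differ.
#[derive(Debug, Clone, PartialEq, Default)]
pub struct LuaScript {
    pub lang: String,
    pub file: Option<String>,
    pub source: Option<String>,
}

/// Replace a `file` reference with inline `source`.
/// `read` is injected so that path validation and I/O stay with the caller.
pub fn inline_script<F>(script: &mut LuaScript, read: F) -> Result<(), String>
where
    F: FnOnce(&str) -> Result<String, String>,
{
    // The README states that only one of `file` or `source` may be set.
    if script.file.is_some() && script.source.is_some() {
        return Err("only one of `file` or `source` may be set".to_string());
    }
    if let Some(path) = script.file.take() {
        match read(&path) {
            Ok(content) => script.source = Some(content),
            Err(e) => {
                // Leave the config unchanged when the read fails.
                script.file = Some(path);
                return Err(e);
            }
        }
    }
    // A script with no `file` is already self-contained (or absent).
    Ok(())
}
```

Passing the reader as a closure keeps the sketch independent of the filesystem; in practice the closure would be the place to apply a directory-containment check before reading the script.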
crates/e2e-tests/tests/publish_flow.rs

Lines changed: 18 additions & 4 deletions
@@ -11,16 +11,26 @@ mod harness;
 const FILTER_TOML: &[u8] = b"command = \"git push\"\n";
 
 fn valid_test(name: &str) -> (String, Vec<u8>) {
-    let content = format!("name = \"{name}\"\n\n[[expect]]\ncontains = \"ok\"\n");
+    let content =
+        format!("name = \"{name}\"\ninline = \"ok output\"\n\n[[expect]]\ncontains = \"ok\"\n");
     (format!("{name}.toml"), content.into_bytes())
 }
 
+fn default_test() -> (String, Vec<u8>) {
+    (
+        "default.toml".to_string(),
+        b"name = \"default\"\ninline = \"\"\n\n[[expect]]\nequals = \"\"\n".to_vec(),
+    )
+}
+
 /// Publish a filter → verify is_new=true, hash returned.
 #[crdb_test_macro::crdb_test(migrations = "../tokf-server/migrations")]
 async fn publish_filter_returns_hash(pool: PgPool) {
     let h = harness::TestHarness::with_storage(pool).await;
 
-    let (is_new, resp) = h.blocking_publish(FILTER_TOML.to_vec(), vec![]).await;
+    let (is_new, resp) = h
+        .blocking_publish(FILTER_TOML.to_vec(), vec![default_test()])
+        .await;
 
     assert!(is_new, "expected is_new=true for first publish");
     assert!(!resp.content_hash.is_empty());
@@ -44,10 +54,14 @@ async fn publish_with_invalid_test_fails(pool: PgPool) {
 async fn duplicate_publish_returns_existing(pool: PgPool) {
     let h = harness::TestHarness::with_storage(pool).await;
 
-    let (is_new1, resp1) = h.blocking_publish(FILTER_TOML.to_vec(), vec![]).await;
+    let (is_new1, resp1) = h
+        .blocking_publish(FILTER_TOML.to_vec(), vec![default_test()])
+        .await;
     assert!(is_new1);
 
-    let (is_new2, resp2) = h.blocking_publish(FILTER_TOML.to_vec(), vec![]).await;
+    let (is_new2, resp2) = h
+        .blocking_publish(FILTER_TOML.to_vec(), vec![default_test()])
+        .await;
     assert!(!is_new2, "expected is_new=false for duplicate publish");
     assert_eq!(resp1.content_hash, resp2.content_hash);
 }

crates/tokf-cli/Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -15,6 +15,7 @@ path = "src/main.rs"
 
 [dependencies]
 tokf-common = { path = "../tokf-common", version = "0.2.12", features = ["validation"] }
+tokf-filter = { path = "../tokf-filter", version = "0.2.12" }
 clap = { version = "4", features = ["derive", "env"] }
 toml = "1.0"
 serde = { version = "1", features = ["derive"] }
@@ -28,7 +29,6 @@ include_dir = { version = "0.7", features = ["glob"] }
 # with sqlx-sqlite's links = "sqlite3" constraint in the workspace resolver.
 rusqlite = { version = "0.32", features = ["bundled"] }
 rkyv = { version = "0.8", features = ["bytecheck", "unaligned"] }
-mlua = { version = "0.11.6", features = ["luau", "vendored", "error-send"] }
 reqwest = { version = "0.12", default-features = false, features = ["rustls-tls", "json", "blocking", "multipart"] }
 # linux-native uses kernel keyutils (no libdbus-sys dependency).
 # sync-secret-service would need libdbus-1-dev on Linux CI/build hosts.
