This repository was archived by the owner on Apr 30, 2026. It is now read-only.
Merged
51 commits
2665998
test(datalog): cmp + arith body literals work as documented
theaspirational Apr 18, 2026
717fab8
docs(plans): datalog aggregates + ray-exomem consumer refactor
theaspirational Apr 18, 2026
875e102
feat(datalog): declare DL_AGG body literal + aggregate operators
theaspirational Apr 18, 2026
6a5dc03
feat(datalog): dl_rule_add_agg builder for aggregate literals
theaspirational Apr 18, 2026
6a56026
fix(datalog): dl_rule_add_agg bumps rule->n_vars to match sibling bui…
theaspirational Apr 18, 2026
d37388d
feat(datalog): aggregates participate in stratification (non-monotonic)
theaspirational Apr 18, 2026
279a6d3
feat(datalog): evaluate count/sum/min/max/avg aggregates
theaspirational Apr 18, 2026
d211a5b
fix(datalog): guard non-i64 columns in aggregate fold; fix test refco…
theaspirational Apr 18, 2026
c78a34b
feat(datalog): MIN/MAX/AVG over empty source emit no row (match core …
theaspirational Apr 18, 2026
faaee1f
feat(datalog): grouped aggregation via ray_group
theaspirational Apr 18, 2026
e5c156a
refactor(datalog): A4.6 polish — DL_AGG_MAX_KEYS=8, drop redundant re…
theaspirational Apr 18, 2026
0c3652d
feat(datalog): surface syntax (count|sum|min|max|avg ?v pred [col] [b…
theaspirational Apr 18, 2026
6793bda
feat(datalog): float constants in expressions; AVG output promoted to…
theaspirational Apr 18, 2026
e8c43b7
feat(datalog): (between ?x lo hi) parser sugar lowers to two cmps
theaspirational Apr 18, 2026
c1933f8
test(datalog): aggregate empty-source and parser error-path tests
theaspirational Apr 18, 2026
c783977
test(datalog): mixed i64+f64 arithmetic promotion
theaspirational Apr 18, 2026
7351910
refactor(datalog): scope DL_AGG empty guard to SUM; doc arity re-reso…
theaspirational Apr 18, 2026
4724a6e
feat(datalog): auto-register env-bound EDBs for query rule bodies
theaspirational Apr 18, 2026
a626ccb
feat(datalog): typed head constants with broadcast projection
theaspirational Apr 19, 2026
a3fe8f3
test(datalog): head-const single-rule, i64, cross-IDB coverage
theaspirational Apr 19, 2026
8c68967
test(datalog): head-const f64, agg, negation, stratification, surface
theaspirational Apr 19, 2026
086a41a
feat(datalog): surface syntax accepts RAY_STR in body constant positions
theaspirational Apr 19, 2026
4e9b4de
fix(datalog): harden expression eval, head-const types, and grouped a…
theaspirational Apr 21, 2026
f80b228
test(datalog): grouped min/max/avg and env-bound EDB auto-registration
theaspirational Apr 21, 2026
0a32bd7
feat(runtime): ray_runtime_create_with_sym loads sym before builtins
theaspirational Apr 21, 2026
2478ab4
feat(runtime): harden sym_load ordering, error surfacing, and budget …
theaspirational Apr 21, 2026
ed48c3e
Update src/ops/datalog.c
singaraiona Apr 22, 2026
5696f04
Update src/ops/datalog.c
singaraiona Apr 22, 2026
a75a6a8
fix(datalog): env-bound EDB auto-register must accept any non-negativ…
Apr 22, 2026
2a2b8c2
fix(datalog): address Copilot review — sym width, agg bounds, f64 sca…
Apr 22, 2026
b2de655
fix(datalog): empty-source F64 SUM emits RAY_F64 identity, not RAY_I64 0
Apr 22, 2026
6d0929a
fix(datalog): scalar aggregate validates value col index before row-c…
Apr 22, 2026
ecdb0d9
fix(datalog): propagate dl_project errors; unknown-arity agg sentinel
Apr 23, 2026
06ddbc1
fix(datalog): dl_eval surfaces compile/runtime failures instead of sw…
Apr 23, 2026
ed2824b
fix(datalog): Phase B surfaces every runtime failure, not just ray_ex…
Apr 23, 2026
e294723
fix(datalog): table_union/distinct/antijoin never silently produce pa…
Apr 23, 2026
2b1846e
fix(datalog): table_union rejects schema mismatch instead of narrowing
Apr 23, 2026
3a025ce
fix(datalog): table_union schema check runs before empty-rows bypass
Apr 23, 2026
c321670
fix(datalog,runtime): address round-3 Copilot review
Apr 23, 2026
dc73dbd
fix(datalog,runtime): ray_error_free actually reclaims error blocks
Apr 23, 2026
e0bf053
fix(datalog,runtime): address round-4 Copilot review
Apr 23, 2026
973ee76
fix(datalog,runtime): address round-5 Copilot review
Apr 23, 2026
2759d34
fix(datalog): dl_filter_eq pass-through must retain (owned-ref contract)
Apr 23, 2026
86880ab
fix(datalog): surface head-const conflicts via eval_err, drop stderr …
Apr 23, 2026
916d84a
fix(datalog,runtime): address round-6+7 Copilot review
Apr 23, 2026
5b7949b
fix(datalog): check dl_filter_eq return at call sites
Apr 23, 2026
3ff6a7b
fix(datalog): round-8 — add_col leaks, broadcast SYM width, align eva…
Apr 23, 2026
d348677
fix(datalog): release out on const-slot add_col failure in dl_project
Apr 23, 2026
2cd5741
fix(datalog,runtime,docs): round-9 Copilot review
Apr 23, 2026
2ecae48
fix(datalog): DL_ASSIGN treats eval failures as unrecoverable
Apr 23, 2026
23a0cb5
fix(datalog,runtime): round-10 Copilot review
Apr 23, 2026
1,012 changes: 1,012 additions & 0 deletions docs/plans/2026-04-18-datalog-aggregates-and-onboarding.md

Large diffs are not rendered by default.

5 changes: 5 additions & 0 deletions include/rayforce.h
@@ -156,6 +156,11 @@ ray_t* ray_error(const char* code, const char* fmt, ...);
const char* ray_err_code_str(ray_err_t e);
ray_err_t ray_err_from_obj(ray_t* err);
const char* ray_err_code(ray_t* err);
/* Free a RAY_ERROR object. ray_release() is a deliberate no-op for
* error ray_t* (see src/mem/cow.c), so callers that hold the sole
* reference and want the block reclaimed must use this helper instead —
* otherwise the error leaks until heap teardown. */
void ray_error_free(ray_t* err);

/* ===== Accessor Macros ===== */
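The retype-then-free idea described in the `ray_error_free` comment can be illustrated with a small mock. This is only a sketch under stated assumptions: `mock_obj_t`, `mock_alloc`, `mock_free`, `mock_error_free`, and the `MOCK_*` type tags are hypothetical stand-ins invented here, not the rayforce types or allocator. It shows why a generic free path that refuses error objects needs a dedicated helper to reclaim them.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for this sketch only; not the rayforce definitions. */
enum { MOCK_I64 = 5, MOCK_ERROR = 127 };
typedef struct { int type; } mock_obj_t;

static int mock_live_blocks = 0; /* blocks allocated and not yet reclaimed */

static mock_obj_t* mock_alloc(int type) {
    mock_obj_t* o = (mock_obj_t*)malloc(sizeof *o);
    assert(o != NULL);
    o->type = type;
    mock_live_blocks++;
    return o;
}

/* Generic free path: deliberately ignores error objects, mirroring the
 * "short-circuit on error type" guard the diff comment describes. */
static void mock_free(mock_obj_t* o) {
    if (!o || o->type == MOCK_ERROR) return;
    mock_live_blocks--;
    free(o);
}

/* Dedicated helper: retype the block to a leaf atom so the guard above
 * no longer fires, then send it through the normal free path. */
static void mock_error_free(mock_obj_t* err) {
    if (!err || err->type != MOCK_ERROR) return;
    err->type = -MOCK_I64;
    mock_free(err);
}
```

In this mock, calling `mock_free` on an error object leaves `mock_live_blocks` unchanged, while `mock_error_free` reclaims the block. As in the real helper, the caller must not touch the pointer afterwards.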

87 changes: 82 additions & 5 deletions src/core/runtime.c
@@ -27,6 +27,8 @@
#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <errno.h>
#ifdef RAY_OS_WINDOWS
#include <windows.h>
#else
@@ -137,6 +139,21 @@ ray_t* ray_error(const char* code, const char* fmt, ...) {
return err;
}

void ray_error_free(ray_t* err) {
/* Skip NULL and anything that isn't actually a RAY_ERROR — callers
* often pass a result that might be either an error or a real value. */
if (!err || !RAY_IS_ERR(err)) return;
/* Both ray_free and ray_release_owned_refs short-circuit on RAY_IS_ERR
* as a safety default (the refcount system deliberately does not track
* error objects). Retype the block to a leaf atom (-RAY_I64) so those
* guards don't fire — an atom with no owned children is the safest
* shape to pass through the standard free path. The rc was already
* 1 from ray_alloc, so ray_free will reclaim the block via the buddy
* allocator. From this point the caller must not touch err again. */
err->type = -RAY_I64;
ray_free(err);
}

const char* ray_err_code(ray_t* err) {
if (!err || err->type != RAY_ERROR) return NULL;
/* sdata is 7 bytes and may not be null-terminated when full */
@@ -157,14 +174,17 @@ void ray_error_clear(void) {

/* ===== Lifecycle ===== */

- ray_runtime_t* ray_runtime_create(int argc, char** argv) {
- (void)argc; (void)argv;
static ray_runtime_t* runtime_create_impl(const char* sym_path,
ray_err_t* out_sym_err) {
if (out_sym_err) *out_sym_err = RAY_OK;

/* Init subsystems */
ray_heap_init();
ray_sym_init();

- /* Allocate runtime via system allocator */
/* Allocate runtime and set __VM + mem_budget BEFORE any file I/O so
* that ray_error() has a live VM to record diagnostics against and
* allocations are bounded by the budget. */
ray_runtime_t* rt = (ray_runtime_t*)ray_sys_alloc(sizeof(ray_runtime_t));
if (!rt) return NULL;
memset(rt, 0, sizeof(*rt));
@@ -196,13 +216,70 @@ ray_runtime_t* ray_runtime_create(int argc, char** argv) {
rt->mem_budget = (int64_t)(4ULL << 30);
#endif

- /* Init language (env + builtins) — must be after __VM is set */
/* __RUNTIME must be visible before ray_sym_load so mem_budget checks
* and ray_error() both operate against the live runtime. */
__RUNTIME = rt;

/* Load persisted symbol table BEFORE ray_lang_init interns builtins.
* Ordering: __VM + mem_budget are live so file I/O errors surface via
* ray_error() and allocations are budget-bounded. Still before
* ray_lang_init so persisted user symbol IDs keep their slots and
* builtins append afterwards. */
if (sym_path) {
/* Pre-flight size check: reject files that would blow past the
* memory budget before ever touching ray_col_load.
*
* errno handling: ENOENT is the normal first-run case and stays
* RAY_OK; any *other* stat failure (EACCES, ENOTDIR, EIO, …) is
* a real problem and must be surfaced as RAY_ERR_IO, otherwise
* the caller would silently continue with an empty sym table
* and later hit the "divergence" class of bugs this entrypoint
* was added to avoid. */
struct stat st;
if (stat(sym_path, &st) == 0) {
/* Allow the sym file itself plus some working headroom (2x).
* A well-formed sym file is a list of interned strings; the
* in-memory footprint is bounded by file size within a small
* constant factor. */
if (st.st_size > 0 &&
(int64_t)st.st_size > rt->mem_budget / 2) {
if (out_sym_err) *out_sym_err = RAY_ERR_OOM;
/* Continue startup with empty sym table; caller decides
* whether to treat this as fatal. */
} else {
ray_err_t sym_err = ray_sym_load(sym_path);
if (out_sym_err) *out_sym_err = sym_err;
/* RAY_ERR_CORRUPT and I/O errors are non-fatal here:
* caller inspects out_sym_err to decide recovery. */
}
} else if (errno != ENOENT) {
if (out_sym_err) *out_sym_err = RAY_ERR_IO;
}
/* ENOENT: leave out_sym_err = RAY_OK — absent sym file is the
* normal first-run case. */
}
Comment on lines +228 to +260
Copilot AI Apr 22, 2026
runtime_create_impl treats any stat() failure as “no sym file” (out_sym_err stays RAY_OK) and skips ray_sym_load entirely. That hides real I/O problems (e.g., EACCES) and defeats the whole "load sym before builtins" guarantee for persistent consumers. Consider distinguishing ENOENT from other errors and either (a) attempt ray_sym_load directly and rely on its error codes, or (b) set out_sym_err appropriately for non-ENOENT stat failures.
Comment on lines +228 to +260
Copilot AI Apr 23, 2026
runtime_create_impl(): when stat(sym_path, …) fails, the code currently leaves out_sym_err = RAY_OK and skips ray_sym_load entirely. That hides real I/O errors (EACCES, ENOTDIR, etc.) and contradicts the header comment that the _err variant surfaces I/O failures. Consider: (1) treat errno==ENOENT as OK/first-run, but for other errno set *out_sym_err = RAY_ERR_IO; and/or (2) call ray_sym_load regardless and only use stat for the size pre-flight when it succeeds.

/* Init language (env + builtins) — must be after __VM is set and
* after sym_load so persisted user IDs keep their slots. */
ray_lang_init();

- __RUNTIME = rt;
return rt;
}

ray_runtime_t* ray_runtime_create(int argc, char** argv) {
(void)argc; (void)argv;
return runtime_create_impl(NULL, NULL);
}

ray_runtime_t* ray_runtime_create_with_sym(const char* sym_path) {
return runtime_create_impl(sym_path, NULL);
}

ray_runtime_t* ray_runtime_create_with_sym_err(const char* sym_path,
ray_err_t* out_sym_err) {
return runtime_create_impl(sym_path, out_sym_err);
}
Comment thread
singaraiona marked this conversation as resolved.

Copilot AI Apr 23, 2026
The new public entrypoints ray_runtime_create_with_sym() / ray_runtime_create_with_sym_err() are defined here, but they aren’t declared in src/core/runtime.h or the public include/rayforce.h (tests work around this via manual externs). Please add the prototypes to the appropriate headers so consumers can use the APIs without duplicating declarations, and so signature changes can’t silently drift.

Suggested change
/* NOTE: These public entrypoints must also be declared in src/core/runtime.h
* and include/rayforce.h to keep the API surface synchronized.
*/
/* ===== Memory Budget API ===== */

int64_t ray_mem_budget(void) {
10 changes: 10 additions & 0 deletions src/core/runtime.h
@@ -103,10 +103,20 @@ extern _Thread_local ray_vm_t *__VM;
ray_runtime_t* ray_runtime_create(int argc, char** argv);
void ray_runtime_destroy(ray_runtime_t* rt);

/* Persistent-consumer lifecycle: load the sym table from `sym_path` (if
* present) before builtins register, so user-interned IDs keep the same
* slots across process restarts. The _err variant surfaces the load
* result via `out_sym_err` (RAY_OK / RAY_ERR_CORRUPT / I/O errors) so
* callers can decide recovery policy; the plain variant discards it. */
ray_runtime_t* ray_runtime_create_with_sym(const char* sym_path);
ray_runtime_t* ray_runtime_create_with_sym_err(const char* sym_path,
ray_err_t* out_sym_err);
Comment on lines +106 to +113
Copilot AI Apr 23, 2026
New public APIs ray_runtime_create_with_sym / ray_runtime_create_with_sym_err change init ordering and error semantics, but there are still no tests exercising the key cases (load-before-builtins stability, corrupt file -> RAY_ERR_CORRUPT, oversized file -> RAY_ERR_OOM, and non-ENOENT I/O failure -> RAY_ERR_IO). Adding targeted runtime/sym integration tests would help prevent regressions in persistent-consumer scenarios.
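For the non-ENOENT I/O failure case, one portable way to provoke a `stat()` error other than `ENOENT` in a test is to ask for a path that descends through a regular file, which POSIX reports as `ENOTDIR`. The following is a minimal sketch of that fixture only; the helper name `errno_for_path_under_regular_file` and the `/tmp/sym_probe_` prefix are assumptions made here, and wiring the resulting path into `ray_runtime_create_with_sym_err` is left to the real test suite.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates a temporary regular file, then stats "<file>/sym".
 * Returns the errno from that stat() call, 0 if stat unexpectedly
 * succeeded, or -1 if the temporary file could not be created. */
static int errno_for_path_under_regular_file(void) {
    char file_path[] = "/tmp/sym_probe_XXXXXX";
    int fd = mkstemp(file_path);
    if (fd < 0) return -1;
    close(fd);

    char child[64];
    int n = snprintf(child, sizeof child, "%s/sym", file_path);
    assert(n > 0 && (size_t)n < sizeof child);

    struct stat st;
    errno = 0;
    int rc = stat(child, &st);
    int saved = (rc == 0) ? 0 : errno; /* capture before unlink can clobber it */

    unlink(file_path);
    return saved;
}
```

A path built this way would presumably drive the `errno != ENOENT` branch of the pre-flight check, without depending on file permissions or running as a particular user.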

/* Error API — allocates ray_t with type=RAY_ERROR, sets __VM->err.msg */
ray_t* ray_error(const char* code, const char* fmt, ...);
/* Read error code from a RAY_ERROR object (returns pointer to sdata) */
const char* ray_err_code(ray_t* err);
/* ray_error_free() is published in include/rayforce.h */

/* Read VM error detail message (NULL if empty) */
const char* ray_error_msg(void);