diff --git a/cpp/NOTES.md b/cpp/NOTES.md new file mode 100644 index 00000000..4a774438 --- /dev/null +++ b/cpp/NOTES.md @@ -0,0 +1,22 @@ +# C++ Implementation Notes + +## undefined vs null + +C++ (via nlohmann::json) has only `null` (JSON null) — there is no native distinction between +"absent" and "null". +For this library: +- `nullptr` / `json(nullptr)` is used to represent **property absence** (the TypeScript + `undefined` equivalent). +- TypeScript tests relating to `undefined` should be treated as **property absence**: the key + does not exist in the JSON object, or the function parameter was not provided. +- JSON null is ambiguous with absent. Where the distinction matters, the test runner should use + marker strings: `NULLMARK = "__NULL__"` for JSON null and `UNDEFMARK = "__UNDEF__"` for absent values. +- Consider using `std::optional` to distinguish absent from null where needed. +- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues. + +## Type System + +This implementation uses bitfield integers for the type system, matching the TypeScript canonical. +Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are defined in the `VoxgigStruct` namespace +and `typify()` returns integer bitfields. Use `typename_of()` (`typename` is a reserved keyword in C++) to get the human-readable name for +error messages. Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`). diff --git a/cpp/REVIEW.md b/cpp/REVIEW.md new file mode 100644 index 00000000..3266358d --- /dev/null +++ b/cpp/REVIEW.md @@ -0,0 +1,234 @@ +# C++ (cpp) - Review vs TypeScript Canonical + +## Overview + +The C++ version is the **most incomplete** implementation, with only **~18 functions** out of 40+. It covers basic type checking, property access, walk, merge, stringify, and clone. All major subsystems (getpath, setpath, inject, transform, validate, select) are **entirely missing**. 
The API design uses an unusual pattern where all functions take `args_container&&` (vector of JSON values), which differs significantly from all other implementations. + +--- + +## Missing Functions + +### Critical (Core Operations - All Missing) +| Function | Category | Impact | +|----------|----------|--------| +| `getpath` | Path operations | Cannot navigate nested structures by path | +| `setpath` | Path operations | Cannot set values at nested paths | +| `inject` | Injection | No value injection from store | +| `transform` | Transform | No data transformation capability | +| `validate` | Validation | No data validation capability | +| `select` | Query | No query/filter on children | + +### Minor Utilities (All Missing) +| Function | Category | Impact | +|----------|----------|--------| +| `getelem` | Property access | No negative-index element access | +| `getdef` | Property access | No defined-or-default helper | +| `delprop` | Property access | No dedicated property deletion | +| `size` | Collection | No unified size function | +| `slice` | Collection | No array/string slicing | +| `flatten` | Collection | No array flattening | +| `filter` | Collection | No predicate filtering | +| `pad` | String | No string padding | +| `replace` | String | No unified string replace | +| `join` | String | No general join function | +| `jsonify` | Serialization | No JSON formatting | +| `strkey` | String | No key-to-string conversion | +| `typename` | Type system | No type name function | +| `typify` | Type system | No type identification function | +| `pathify` | String | No path-to-string conversion | +| `jm`/`jt` | JSON builders | No JSON builder functions | +| `checkPlacement` | Advanced | No placement validation | +| `injectorArgs` | Advanced | No injector argument validation | +| `injectChild` | Advanced | No child injection helper | + +--- + +## Architectural Issues + +### 1. 
All Functions Take `args_container&&` + +- **TS**: Functions have named, typed parameters (e.g., `isnode(val: any)`). +- **C++**: All functions take `args_container&&` (aka `std::vector<json>&&`), extracting parameters by position from the vector. +- **Impact**: + - No compile-time parameter validation. + - No IDE autocompletion for parameters. + - Runtime errors for wrong argument count/types. + - Cannot distinguish between functions by signature. + - Makes the API feel like a scripting language dispatch table rather than a C++ library. +- **Recommendation**: Consider proper C++ function signatures with typed parameters. The `args_container` pattern should only be used for the test runner dispatch, not for the public API. + +### 2. `walk` Uses `intptr_t` to Pass Function Pointers Through JSON + +- **TS**: `walk(val, apply)` where `apply` is a function. +- **C++**: The apply function pointer is cast to `intptr_t`, stored in a JSON number, and cast back when needed. +- **Impact**: + - **Extremely unsafe** - the pointer-to-integer round trip relies on implementation-defined conversions, and calling through a pointer recovered with the wrong type is undefined behavior. + - Breaks if JSON library modifies the number (e.g., float conversion). + - Not portable across architectures. + - No type safety for the callback. +- **Recommendation**: Redesign walk to take the callback as a separate parameter, not embedded in the args vector. + +### 3. No Injection System + +- No `Injection` class or equivalent. +- No injection state management. +- No type constants. +- No `SKIP`/`DELETE` sentinels. + +### 4. `clone` Edge Cases Are Unverified + +- **TS**: Deep clones via `JSON.parse(JSON.stringify(val))`. +- **C++**: Simple JSON copy. Copying a `nlohmann::json` value is already deep, but the implementation special-cases null by returning `nullptr`, which suggests it may not handle all cases. + +--- + +## Existing Function Issues + +### 1. `isfunc` Uses Template Specialization + +- **TS**: Simple `typeof val === 'function'` check. 
+- **C++**: Complex template specialization that returns `true` for `std::function` and `false` for everything else. +- **Impact**: Only detects one specific function type. Cannot detect lambdas, function pointers, or other callables. + +### 2. `iskey` Handles Booleans + +- The C++ version explicitly returns `false` for booleans in `iskey`, which is correct (matching TS behavior since `typeof true` is not `"string"` or `"number"` in JS). Good. + +### 3. `setprop` Array Handling + +- Uses direct vector pointer manipulation (`(*it).data()`) for array operations. +- **Impact**: May have memory safety issues with iterator invalidation. + +### 4. `stringify` Is Minimal + +- Basic `dump()` call with quote stripping and optional truncation. +- Missing: sorted keys, custom formatting, depth handling. + +--- + +## Significant Language Difference Issues + +### 1. No Dynamic Typing + +- **Issue**: C++ is statically typed. The library uses `nlohmann::json` to provide dynamic JSON values, but C++ has no native equivalent of JavaScript's dynamic typing. +- **Impact**: Every operation requires type checking at runtime through the JSON library's type system. This is verbose but functional. + +### 2. No `undefined` vs `null` Distinction + +- **Issue**: `nlohmann::json` has `null` but no `undefined`. +- **Impact**: Same as all other non-JS implementations. + +### 3. Memory Management + +- **Issue**: C++ requires explicit memory management. The `nlohmann::json` type handles its own memory, but function pointers, callbacks, and the `Utility`/`Provider` classes have manual memory management. +- **Impact**: Potential for memory leaks or use-after-free in the `Provider` and `Utility` classes. + +### 4. No Garbage Collection + +- **Issue**: Circular references in data structures cannot be automatically collected. +- **Impact**: The `walk` function must be careful about reference cycles. The `inject` system (when implemented) must handle cycles explicitly. + +### 5. 
No Regular Expression Literals + +- **Issue**: C++ uses the `<regex>` library with string-based patterns. +- **Impact**: `escre` manually escapes characters (correct approach). + +### 6. No Closures as First-Class Citizens (Pre-C++11) + +- **Issue**: C++11 lambdas exist but are not JSON-serializable. The current approach of casting function pointers to integers is extremely fragile. +- **Impact**: The callback/handler system needs a completely different design from TS. + +### 7. No Exception-Safe Error Collection + +- **Issue**: C++ exception handling is more expensive than JS try/catch. The validate system (when implemented) should prefer error collection over exceptions. +- **Impact**: Design consideration for future implementation. + +### 8. Template Metaprogramming Complexity + +- **Issue**: The `isfunc` template specialization pattern is complex and fragile. Adding new callable types requires new specializations. +- **Impact**: Consider using a simpler runtime check instead. + +--- + +## Test Coverage + +Minimal test coverage: +- Minor function tests: `isnode`, `ismap`, `islist`, `iskey`, `isempty`, `isfunc`, `getprop`, `keysof`, `haskey`, `items`, `escre`, `escurl`, `joinurl`, `stringify`, `clone`, `setprop` +- Walk: `walk-basic` only +- Merge: `merge-basic` only +- **No tests for**: getpath, setpath, inject, transform, validate, select (functions don't exist) + +Uses Catch2-style test framework with shared `test.json` spec. + +--- + +## Alignment Plan + +### Phase 1: API Redesign (Critical - Do First) +1. **Redesign function signatures** to use proper C++ parameters instead of `args_container&&` + - Example: `json isnode(const json& val)` instead of `json isnode(args_container&& args)` + - Keep `args_container` dispatch only for the test runner +2. **Remove intptr_t function pointer casting** from `walk` + - Pass callback as a typed `std::function` parameter +3. 
**Fix isfunc** to use a runtime callable check or a dedicated `JsonFunction` wrapper + +### Phase 2: Missing Minor Functions +4. Add `typify(val)` returning bitfield integers +5. Add all type constants (`T_any`, `T_noval`, `T_boolean`, etc.) +6. Add `typename(t)` function +7. Add `strkey(key)` function +8. Add `getelem(val, key, alt)` with negative index support +9. Add `getdef(val, alt)` helper +10. Add `delprop(parent, key)` function +11. Add `size(val)` function +12. Add `slice(val, start, end)` function +13. Add `flatten(list, depth)` function +14. Add `filter(val, check)` function +15. Add `pad(str, padding, padchar)` function +16. Add `replace(s, from, to)` function +17. Add `join(arr, sep, url)` function +18. Add `jsonify(val, flags)` function +19. Add `pathify(val, startin, endin)` function +20. Add `jm`/`jt` JSON builder functions +21. Add `SKIP` and `DELETE` sentinel values + +### Phase 3: Path Operations +22. Implement `getpath(store, path, injdef)` with full path syntax support +23. Implement `setpath(store, path, val, injdef)` + +### Phase 4: Injection System +24. Design and implement `Injection` class/struct +25. Implement `inject(val, store, injdef)` with full injection system +26. Implement `injectChild(child, store, inj)` +27. Add `checkPlacement` and `injectorArgs` functions + +### Phase 5: Transform +28. Implement `transform(data, spec, injdef)` with all commands: + - `$DELETE`, `$COPY`, `$KEY`, `$ANNO`, `$MERGE`, `$EACH`, `$PACK` + - `$REF`, `$FORMAT`, `$APPLY`, `$BT`, `$DS`, `$WHEN` + +### Phase 6: Validate +29. Implement `validate(data, spec, injdef)` with all validators: + - `$MAP`, `$LIST`, `$STRING`, `$NUMBER`, `$INTEGER`, `$DECIMAL` + - `$BOOLEAN`, `$NULL`, `$NIL`, `$FUNCTION`, `$INSTANCE`, `$ANY` + - `$CHILD`, `$ONE`, `$EXACT` + +### Phase 7: Select +30. Implement `select(children, query)` with operators: + - `$AND`, `$OR`, `$NOT`, `$GT`, `$LT`, `$GTE`, `$LTE`, `$LIKE` + +### Phase 8: Walk Enhancement +31. 
Add `before`/`after` callback support to `walk` +32. Add `maxdepth` parameter + +### Phase 9: Test Coverage +33. Add tests for all new functions using shared `test.json` +34. Add all test categories matching TS suite +35. Ensure memory safety (run with AddressSanitizer) +36. Ensure no undefined behavior (run with UBSan) + +### Phase 10: Code Quality +37. Add proper error handling (not exceptions for expected cases) +38. Review memory management in Utility/Provider classes +39. Add const-correctness throughout +40. Consider using `std::optional` for absent values diff --git a/cpp/src/utility_decls.hpp b/cpp/src/utility_decls.hpp index 1ba96ffc..b952de39 100644 --- a/cpp/src/utility_decls.hpp +++ b/cpp/src/utility_decls.hpp @@ -5,6 +5,7 @@ #include #include +#include #include diff --git a/cpp/src/voxgig_struct.hpp b/cpp/src/voxgig_struct.hpp index 242e3db3..81b7b2fd 100644 --- a/cpp/src/voxgig_struct.hpp +++ b/cpp/src/voxgig_struct.hpp @@ -7,8 +7,99 @@ namespace VoxgigStruct { namespace S { const std::string empty = ""; + const std::string any = "any"; + const std::string nil = "nil"; + const std::string boolean_s = "boolean"; + const std::string decimal = "decimal"; + const std::string integer = "integer"; + const std::string number = "number"; + const std::string string_s = "string"; + const std::string function_s = "function"; + const std::string symbol = "symbol"; + const std::string null_s = "null"; + const std::string list = "list"; + const std::string map = "map"; + const std::string instance = "instance"; + const std::string scalar = "scalar"; + const std::string node = "node"; + const std::string viz = ": "; }; + // Type constants - bitfield integers matching TypeScript canonical. 
+ constexpr int T_any = 0x7FFFFFFF; + constexpr int T_noval = 1 << 30; + constexpr int T_boolean = 1 << 29; + constexpr int T_decimal = 1 << 28; + constexpr int T_integer = 1 << 27; + constexpr int T_number = 1 << 26; + constexpr int T_string = 1 << 25; + constexpr int T_function = 1 << 24; + constexpr int T_symbol = 1 << 23; + constexpr int T_null = 1 << 22; + constexpr int T_list = 1 << 14; + constexpr int T_map = 1 << 13; + constexpr int T_instance = 1 << 12; + constexpr int T_scalar = 1 << 7; + constexpr int T_node = 1 << 6; + + const std::string TYPENAME[] = { + S::any, S::nil, S::boolean_s, S::decimal, S::integer, S::number, S::string_s, + S::function_s, S::symbol, S::null_s, + "", "", "", "", "", "", "", + S::list, S::map, S::instance, + "", "", "", "", + S::scalar, S::node, + }; + constexpr int TYPENAME_LEN = 26; + + // Get type name string from type bitfield value. + inline std::string typename_of(int t) { + std::string tname = ""; + for (int tI = 0; tI < TYPENAME_LEN; tI++) { + if (!TYPENAME[tI].empty() && 0 < (t & (1 << (31 - tI)))) { + tname = TYPENAME[tI]; + } + } + return tname; + } + + // Determine the type of a value as a bitfield integer. + inline int typify(const json& value) { + if (value.is_null()) { + return T_noval; + } + + if (value.is_boolean()) { + return T_scalar | T_boolean; + } + + if (value.is_number_integer()) { + return T_scalar | T_number | T_integer; + } + + if (value.is_number_float()) { + double d = value.get<double>(); + if (std::isnan(d)) { + return T_noval; + } + return T_scalar | T_number | T_decimal; + } + + if (value.is_string()) { + return T_scalar | T_string; + } + + if (value.is_array()) { + return T_node | T_list; + } + + if (value.is_object()) { + return T_node | T_map; + } + + return T_any; + } + inline json isnode(args_container&& args) { json val = args.size() == 0 ? 
nullptr : std::move(args[0]); diff --git a/go/NOTES.md b/go/NOTES.md new file mode 100644 index 00000000..ffbc1dbb --- /dev/null +++ b/go/NOTES.md @@ -0,0 +1,19 @@ +# Go Implementation Notes + +## undefined vs null + +Go has only `nil` — there is no native distinction between "absent" and "null". +For this library: +- `nil` is used to represent **property absence** (the TypeScript `undefined` equivalent). +- TypeScript tests relating to `undefined` should be treated as **property absence**: the key + does not exist in the map, or the function parameter was not provided. +- JSON null is ambiguous with `nil`. Where the distinction matters, the test runner uses + marker strings: `NULLMARK = "__NULL__"` for JSON null and `UNDEFMARK = "__UNDEF__"` for absent values. +- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues. + +## Type System + +This implementation uses bitfield integers for the type system, matching the TypeScript canonical. +Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are exported and `Typify()` returns +integer bitfields. Use `Typename()` to get the human-readable name for error messages. +Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`). diff --git a/go/REVIEW.md b/go/REVIEW.md new file mode 100644 index 00000000..c2880702 --- /dev/null +++ b/go/REVIEW.md @@ -0,0 +1,159 @@ +# Go (go) - Review vs TypeScript Canonical + +## Overview + +The Go version is one of the most extensive implementations, with **50+ exported functions** - actually exceeding the TypeScript canonical in some areas. It uses the unified `*Injection` struct pattern and has comprehensive type constants. The main differences stem from Go's static type system, lack of generics (prior to 1.18), and explicit error handling. 
+ +--- + +## Extra Functions (Not in TS) + +| Function | Purpose | Notes | +|----------|---------|-------| +| `CloneFlags` | Clone with options (func, wrap, unwrap) | TS `clone` is simpler | +| `WalkDescend` | Walk with explicit path tracking | TS combines this into `walk` | +| `TransformModify` | Transform with modify function | TS uses `injdef.modify` | +| `TransformModifyHandler` | Transform with handler+modify | TS uses `injdef` | +| `TransformCollect` | Transform returning errors | TS uses `injdef.errs` | +| `ItemsApply` | Items with apply function | TS overloads `items` | +| `ListRefCreate` | Generic list reference | Go-specific utility | + +These extra functions exist because Go cannot use optional parameters or function overloading. They are acceptable language adaptations. + +--- + +## Missing Functions + +| Function | Category | Impact | +|----------|----------|--------| +| `replace` | String | No unified string replace wrapper | +| `jm` | JSON builders | Named `Jo` instead | +| `jt` | JSON builders | Named `Ja` instead | + +--- + +## Naming Differences + +| TS Name | Go Name | Notes | +|---------|---------|-------| +| `jm` | `Jo` | Go convention: exported, but different name | +| `jt` | `Ja` | Go convention: exported, but different name | +| `joinurl` | `JoinUrl` | Capitalized per Go convention | +| All functions | PascalCase | Go requires exported names to be capitalized | + +--- + +## API Signature Differences + +### 1. `Injection` is a struct with pointer semantics + +- **TS**: `Injection` is a class with methods. +- **Go**: `*Injection` is a struct passed by pointer with methods. +- **Notes**: Functionally equivalent. Go's approach is idiomatic. + +### 2. `Validate` returns `(any, error)` tuple + +- **TS**: Returns data; throws on error or collects in `injdef.errs`. +- **Go**: Returns `(any, error)` - Go's standard error pattern. +- **Notes**: Idiomatic Go adaptation. The `error` return replaces throwing. + +### 3. 
`Walk` takes `WalkApply` function type with `*string` key + +- **TS**: `WalkApply = (key: string | number | undefined, val, parent, path) => any` +- **Go**: `WalkApply func(key *string, val any, parent any, path []string) any` +- **Notes**: Go uses `*string` (pointer) to represent optional key (nil = no key). This is a reasonable adaptation since Go has no union types. + +### 4. Variadic parameters replace optional parameters + +- **TS**: `getprop(val, key, alt?)` with optional `alt`. +- **Go**: `GetProp(val any, key any, alts ...any)` with variadic. +- **Notes**: Go doesn't support optional parameters; variadic is the idiomatic replacement. + +### 5. `ListRef` generic wrapper for mutable list references + +- **TS**: Uses plain arrays, passed by reference (JS semantics). +- **Go**: Uses `ListRef[T]` struct with `Append`/`Prepend` methods for `Keys`, `Path`, `Nodes`, `Errs` in the Injection struct. +- **Notes**: Required because a Go slice header is passed by value, so growth via `append` inside a callee is not visible to the caller. This is a necessary language adaptation. + +### 6. `Items` returns `[][2]any` instead of `[string, any][]` + +- **TS**: Returns array of `[string, any]` tuples. +- **Go**: Returns `[][2]any` (slice of 2-element arrays). +- **Notes**: Go has no tuple type; fixed-size array is the closest equivalent. + +--- + +## Significant Language Difference Issues + +### 1. No `undefined` vs `null` Distinction + +- **Issue**: Go has only `nil`. There is no way to distinguish "absent" from "null" at the type level. +- **Workaround**: The test runner uses `NULLMARK`/`UNDEFMARK` string markers. +- **Impact**: Same as Python - inherent limitation requiring careful handling. + +### 2. Type Assertions Required for Dynamic Access + +- **Issue**: Go's static type system requires type assertions (`val.(map[string]any)`) for dynamic JSON-like data. This adds verbosity and runtime panic risk. +- **Impact**: The implementation uses `any` extensively, trading type safety for flexibility. This is the standard Go approach for JSON manipulation. 
+ +### 3. No Function Overloading + +- **Issue**: Go doesn't support function overloading, leading to separate functions like `Items`/`ItemsApply`, `Walk`/`WalkDescend`, and multiple `Transform*` variants. +- **Impact**: API surface is larger but each function is simpler. Acceptable trade-off. + +### 4. Map Iteration Order is Non-deterministic + +- **Issue**: Go maps don't guarantee iteration order. +- **Workaround**: `KeysOf` sorts keys, and operations that iterate maps use sorted keys. +- **Impact**: Correctly handled. + +### 5. No Generics for JSON Value Types (Pre-1.18) + +- **Issue**: JSON values are `any` (interface{}), requiring type switches/assertions everywhere. +- **Impact**: More verbose code but functionally equivalent. `ListRef[T]` uses generics (Go 1.18+). + +### 6. Integer Types + +- **Issue**: Go has multiple numeric types (`int`, `int64`, `float64`). JSON numbers from `encoding/json` decode as `float64` by default. +- **Impact**: Must carefully handle `float64` vs `int` conversions. `Typify` needs to check whether a `float64` holds an integral value. + +### 7. No Regular Expression Literals + +- **Issue**: Go uses `regexp.Compile()` instead of `/pattern/` literals. +- **Impact**: Regex operations in `select` and string matching work differently but are functionally equivalent. + +--- + +## Test Coverage + +Go tests are comprehensive, covering all categories: +- Minor functions, walk, merge, getpath, inject, transform, validate, select. +- Uses shared `test.json` spec via the test runner framework. +- Has additional test infrastructure (`testutil` package) with SDK, Runner, and Direct testing. + +--- + +## Alignment Plan + +### Phase 1: Naming Alignment (Low Priority) +1. Consider adding `Jm`/`Jt` aliases for `Jo`/`Ja` to match TS naming +2. Add `Replace(s, from, to)` function if missing + +### Phase 2: API Review +3. Verify all `Transform*` variants produce identical results to TS `transform` with `injdef` +4. 
Review `Validate` error messages match TS format exactly +5. Ensure `Select` operator behavior matches TS for all edge cases + +### Phase 3: Type System Verification +6. Verify `Typify` correctly distinguishes `float64` integers from true floats +7. Ensure type constant values match TS exactly (same bit positions) +8. Test `T_instance` detection for Go struct types + +### Phase 4: Edge Case Alignment +9. Run full test suite comparison against TS test.json +10. Verify `nil` handling matches TS `undefined`/`null` semantics in all contexts +11. Check `Clone`/`CloneFlags` behavior matches TS `clone` for functions and nested structures + +### Phase 5: Simplification (Optional) +12. Consider whether `TransformModify`, `TransformModifyHandler`, `TransformCollect` can be consolidated with a more Go-idiomatic options pattern +13. Document the rationale for `ListRef` and other Go-specific adaptations diff --git a/java/NOTES.md b/java/NOTES.md new file mode 100644 index 00000000..7c854302 --- /dev/null +++ b/java/NOTES.md @@ -0,0 +1,21 @@ +# Java Implementation Notes + +## undefined vs null + +Java has only `null` — there is no native distinction between "absent" and "null". +For this library: +- `null` is used to represent **property absence** (the TypeScript `undefined` equivalent). +- TypeScript tests relating to `undefined` should be treated as **property absence**: the key + does not exist in the Map, or the function parameter was not provided. +- JSON null is ambiguous with `null`. Where the distinction matters, the test runner should use + marker strings: `NULLMARK = "__NULL__"` for JSON null and `UNDEFMARK = "__UNDEF__"` for absent values. +- A sentinel object (e.g., `static final Object UNDEF = new Object()`) may be used internally + to distinguish absent from null where required. +- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues. 
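The sentinel idea in the Java notes above can be illustrated with a minimal sketch. The class name `UndefSketch` and the helpers `lookup` and `hasKey` are hypothetical and are not part of `Struct.java`; only the `UNDEF` declaration mirrors the example given in the notes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: not the library's API.
final class UndefSketch {
    // Identity-compared marker meaning "property absent" (the TS `undefined` case).
    static final Object UNDEF = new Object();

    // Returns UNDEF when the key is absent; returns null only when the key
    // is present and mapped to JSON null.
    static Object lookup(Map<String, Object> map, String key) {
        return map.containsKey(key) ? map.get(key) : UNDEF;
    }

    // A key "exists" even when its value is null.
    static boolean hasKey(Map<String, Object> map, String key) {
        return lookup(map, key) != UNDEF;
    }

    public static void main(String[] args) {
        Map<String, Object> m = new HashMap<>();
        m.put("a", null);
        System.out.println(hasKey(m, "a"));
        System.out.println(hasKey(m, "b"));
    }
}
```

Comparing against the sentinel by identity (`!=`) rather than `equals` keeps the check unambiguous, since no JSON value can ever be the same object as `UNDEF`.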
+ +## Type System + +This implementation uses bitfield integers for the type system, matching the TypeScript canonical. +Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are defined as static fields and `typify()` +returns integer bitfields. Use `typename()` to get the human-readable name for error messages. +Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`). diff --git a/java/REVIEW.md b/java/REVIEW.md new file mode 100644 index 00000000..04f3b50d --- /dev/null +++ b/java/REVIEW.md @@ -0,0 +1,229 @@ +# Java (java) - Review vs TypeScript Canonical + +## Overview + +The Java version is **severely incomplete**. It implements only ~20 basic functions out of the 40+ in the TypeScript canonical. Major subsystems (`getpath`, `setpath`, `inject`, `transform`, `validate`, `select`) are either missing or only partially stubbed. The test suite is a minimal placeholder. This is the **least mature** implementation alongside C++. + +--- + +## Missing Functions + +### Critical (Core Operations) +| Function | Category | Impact | +|----------|----------|--------| +| `getpath` | Path operations | Cannot navigate nested structures by path | +| `setpath` | Path operations | Cannot set values at nested paths | +| `inject` | Injection | No value injection from store | +| `transform` | Transform | No data transformation capability | +| `validate` | Validation | No data validation capability | +| `select` | Query | No query/filter on children | +| `merge` | Data manipulation | No multi-object merging | + +### Minor Utilities +| Function | Category | Impact | +|----------|----------|--------| +| `getelem` | Property access | No negative-index list access | +| `getdef` | Property access | No defined-or-default helper | +| `delprop` | Property access | No dedicated property deletion | +| `size` | Collection | No unified size function | +| `slice` | Collection | No array/string slicing | +| `flatten` | Collection | No array flattening | +| `filter` | 
Collection | No predicate filtering | +| `pad` | String | No string padding | +| `replace` | String | No unified string replace | +| `join` | String | No general join function | +| `jsonify` | Serialization | No JSON serialization with formatting | +| `strkey` | String | No key-to-string conversion | +| `typename` | Type system | No type name function | +| `typify` | Type system | No type identification function | +| `jm`/`jt` | JSON builders | No JSON builder functions | +| `checkPlacement` | Advanced | No placement validation | +| `injectorArgs` | Advanced | No injector argument validation | +| `injectChild` | Advanced | No child injection helper | + +--- + +## Existing Function Differences + +### 1. `isFunc` checks for `Runnable` instead of general callable + +- **TS**: `isfunc(val)` checks `typeof val === 'function'`. +- **Java**: `isFunc(val)` checks `val instanceof Runnable`. +- **Impact**: Misses `Callable`, `Function`, lambda expressions, and method references. Should check for `java.util.function.Function` or a custom functional interface. + +### 2. `items` returns `List>` + +- **TS**: Returns `[string, any][]` - array of tuples with string keys. +- **Java**: Returns `List>` - Map entries with Object keys. +- **Impact**: Keys are not consistently strings. List indices should be returned as string keys to match TS. + +### 3. `keysof` returns zeros for lists + +- **TS**: Returns string indices (`["0", "1", "2"]`) for lists. +- **Java**: Returns a list of zeros sized to the list length. +- **Impact**: **Incorrect behavior**. This is a bug. + +### 4. `hasKey` delegates to `getProp` null check + +- **TS**: Checks if property is defined (not undefined). +- **Java**: Checks if `getProp` returns non-null. +- **Impact**: Cannot distinguish "key exists with null value" from "key doesn't exist". + +### 5. `setProp` deletes on null + +- **TS**: Has separate `delprop`; `setprop` with `DELETE` sentinel deletes. +- **Java**: `setProp` with `null` value deletes the key. 
+- **Impact**: Cannot set a property to `null` (JSON null). + +### 6. `pathify` has different default `from` parameter + +- **TS**: `pathify(val, startin=0, endin=0)` - starts from index 0 by default. +- **Java**: `pathify(val, from)` with `from` defaulting to 1 in usage. +- **Impact**: Off-by-one behavior difference. + +### 7. `walk` is post-order only + +- **TS**: `walk(val, before?, after?, maxdepth?)` - supports pre-order and post-order. +- **Java**: `walk(val, apply, key, parent, path)` - post-order only, no `maxdepth`. +- **Impact**: Cannot do pre-order transformations; no depth protection. + +### 8. `clone` does not use JSON round-trip + +- **TS**: Uses `JSON.parse(JSON.stringify(val))` with function preservation. +- **Java**: Recursively copies Maps and Lists; primitives returned as references. +- **Impact**: May not correctly deep-clone nested objects that aren't Map/List. + +### 9. `escapeRegex` uses `Pattern.quote()` + +- **TS**: Manually escapes special regex characters. +- **Java**: Uses `Pattern.quote()` which wraps in `\Q...\E`. +- **Impact**: Different escaping mechanism; may behave differently in edge cases. + +### 10. `stringify` uses `Objects.toString()` + +- **TS**: Custom implementation with sorted keys, quote removal, depth handling. +- **Java**: Simple `Objects.toString()` with quote removal. +- **Impact**: Output format will differ significantly for complex objects. + +--- + +## Structural/Architectural Gaps + +### No Injection System +- No `Injection` class or equivalent. +- `InjectMode` enum exists but is unused. +- No injection state management. + +### No Type System +- No bitfield type constants (`T_any`, `T_string`, etc.). +- No `typify` or `typename` functions. +- No type discrimination beyond basic `instanceof`. + +### No SKIP/DELETE Sentinels +- No `SKIP` sentinel. +- `DELETE` sentinel is missing (deletion via null in `setProp`). + +### Minimal Test Infrastructure +- `StructTest.java` is a placeholder (prints "1"). 
+- `Runner.java` has framework code but `TestSubject.invoke()` only handles `isNode`. +- No actual test execution against `test.json`. + +--- + +## Significant Language Difference Issues + +### 1. No Equivalent of `undefined` + +- **Issue**: Java has only `null`. Cannot distinguish "absent" from "JSON null". +- **Recommendation**: Use a sentinel object (e.g., `static final Object UNDEF = new Object()`) similar to the Python approach. + +### 2. Type Erasure with Generics + +- **Issue**: Java generics are erased at runtime. Cannot distinguish `List` from `List` at runtime. +- **Impact**: Type checking in `typify` must use `instanceof` checks on values, not generic type parameters. + +### 3. No Dynamic Property Access + +- **Issue**: Java objects don't support dynamic property access. Must use `Map` for JSON-like structures. +- **Impact**: All "map" operations must work with `Map` interface. No dot-notation property access. + +### 4. No First-Class Functions (Pre-Java 8) + +- **Issue**: Java uses functional interfaces (`Function`, `BiFunction`, custom interfaces) instead of first-class functions. +- **Impact**: Callbacks for `walk`, `inject`, `transform` need well-designed functional interfaces. Current `Runnable` check in `isFunc` is wrong. + +### 5. Checked Exceptions + +- **Issue**: `escapeUrl` declares `throws UnsupportedEncodingException` (which can't actually happen with UTF-8). +- **Impact**: Forces callers to handle checked exceptions unnecessarily. + +### 6. No Spread/Rest Parameters + +- **Issue**: Java has varargs (`Object...`) but they're less flexible than JS spread. +- **Impact**: Functions like `jm`/`jt` need different idioms. + +### 7. Primitive vs Object Types + +- **Issue**: Java distinguishes `int`/`Integer`, `boolean`/`Boolean`, etc. JSON deserialization typically uses boxed types. +- **Impact**: Type checking must handle both primitive and boxed types. 
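The points above about type erasure and boxed types suggest a `typify` built purely on `instanceof` checks against runtime values. The following is a rough sketch under stated assumptions: the class name `TypifySketch` is hypothetical, the bit positions are copied from the C++ constants in this change (assumed to match the TS canonical), treating every `Double` as a decimal mirrors the C++ `is_number_float` branch rather than confirmed TS behavior, and the `T_instance` fallback is a guess.

```java
import java.math.BigInteger;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: not the library's API.
final class TypifySketch {
    // Bit positions copied from the C++ header in this change (assumption).
    static final int T_noval = 1 << 30;
    static final int T_boolean = 1 << 29;
    static final int T_decimal = 1 << 28;
    static final int T_integer = 1 << 27;
    static final int T_number = 1 << 26;
    static final int T_string = 1 << 25;
    static final int T_list = 1 << 14;
    static final int T_map = 1 << 13;
    static final int T_instance = 1 << 12;
    static final int T_scalar = 1 << 7;
    static final int T_node = 1 << 6;

    // Values arrive as Object, so primitives are already boxed here;
    // instanceof on the boxed classes covers both cases.
    static int typify(Object value) {
        if (value == null) return T_noval;
        if (value instanceof Boolean) return T_scalar | T_boolean;
        if (value instanceof Integer || value instanceof Long
                || value instanceof Short || value instanceof Byte
                || value instanceof BigInteger) {
            return T_scalar | T_number | T_integer;
        }
        if (value instanceof Number) {
            // Remaining Number types (Double, Float, BigDecimal) count as decimals.
            double d = ((Number) value).doubleValue();
            if (Double.isNaN(d)) return T_noval;
            return T_scalar | T_number | T_decimal;
        }
        if (value instanceof String) return T_scalar | T_string;
        if (value instanceof List) return T_node | T_list;
        if (value instanceof Map) return T_node | T_map;
        return T_instance; // assumed fallback for arbitrary objects
    }
}
```

Because generic parameters are erased, the checks use the raw `List` and `Map` interfaces; element types play no part in the result.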
+ +--- + +## Test Coverage + +**Almost no functional tests exist.** The test runner framework is partially built but only `isNode` is wired up as a test subject. This is the most critical gap. + +--- + +## Alignment Plan + +### Phase 1: Foundation (Critical) +1. Define `UNDEF` sentinel object for undefined/absent distinction +2. Define `SKIP` and `DELETE` sentinel objects +3. Add all type constants (`T_any`, `T_noval`, `T_boolean`, etc.) +4. Fix `isFunc` to check for `java.util.function.Function` or custom functional interface +5. Fix `keysof` to return string indices for lists (not zeros) +6. Fix `hasKey` to distinguish null values from absent keys +7. Add `strkey`, `typify`, `typename` functions + +### Phase 2: Missing Minor Functions +8. Add `getelem(val, key, alt)` with negative index support +9. Add `getdef(val, alt)` helper +10. Add `delprop(parent, key)` function +11. Add `size(val)` function +12. Add `slice(val, start, end)` function +13. Add `flatten(list, depth)` function +14. Add `filter(val, check)` function +15. Add `pad(str, padding, padchar)` function +16. Add `join(arr, sep, url)` function +17. Add `replace(s, from, to)` function +18. Add `jsonify(val, flags)` function +19. Add `jm(...kv)` and `jt(...v)` JSON builders + +### Phase 3: Core Operations (Critical) +20. Implement `Injection` class with `descend()`, `child()`, `setval()` methods +21. Implement `getpath(store, path, injdef)` function +22. Implement `setpath(store, path, val, injdef)` function +23. Implement `merge(val, maxdepth)` function +24. Implement `inject(val, store, injdef)` function with full injection system +25. Implement `transform(data, spec, injdef)` with all transform commands +26. Implement `validate(data, spec, injdef)` with all validators +27. Implement `select(children, query)` with all operators + +### Phase 4: Fix Existing Functions +28. Fix `walk` to support `before`/`after` callbacks and `maxdepth` +29. Fix `stringify` to produce output matching TS format +30. 
Fix `clone` to handle all JSON-like types correctly +31. Fix `escapeRegex` to match TS escaping behavior (not `Pattern.quote`) +32. Fix `pathify` default index parameter +33. Fix `items` to return string keys consistently + +### Phase 5: Test Infrastructure +34. Wire up `TestSubject.invoke()` for all functions +35. Complete test runner to execute full `test.json` spec +36. Add all test categories matching TS test suite +37. Ensure all tests pass against shared `test.json` + +### Phase 6: Advanced Features +38. Add `checkPlacement`, `injectorArgs`, `injectChild` functions +39. Add custom validator/transform extension support via `injdef.extra` diff --git a/java/src/Struct.java b/java/src/Struct.java index ac926ca1..0e6a8960 100644 --- a/java/src/Struct.java +++ b/java/src/Struct.java @@ -50,21 +50,60 @@ public static class S { public static final String DTOP = "$TOP"; public static final String DERRS = "$ERRS"; public static final String DMETA = "`$META`"; + public static final String ANY = "any"; public static final String ARRAY = "array"; public static final String BASE = "base"; public static final String BOOLEAN = "boolean"; + public static final String DECIMAL = "decimal"; public static final String EMPTY = ""; public static final String FUNCTION = "function"; + public static final String INSTANCE = "instance"; + public static final String INTEGER = "integer"; + public static final String LIST = "list"; + public static final String MAP = "map"; + public static final String NIL = "nil"; + public static final String NODE = "node"; + public static final String NULL = "null"; public static final String NUMBER = "number"; public static final String OBJECT = "object"; + public static final String SCALAR = "scalar"; public static final String STRING = "string"; + public static final String SYMBOL = "symbol"; public static final String KEY = "key"; public static final String PARENT = "parent"; public static final String BT = "`"; public static final String DS = "$"; 
public static final String DT = "."; + public static final String SP = " "; + public static final String VIZ = ": "; public static final String KEY_NAME = "KEY"; } + + // Type constants - bitfield integers matching TypeScript canonical. + public static final int T_any = (1 << 31) - 1; + public static final int T_noval = 1 << 30; + public static final int T_boolean = 1 << 29; + public static final int T_decimal = 1 << 28; + public static final int T_integer = 1 << 27; + public static final int T_number = 1 << 26; + public static final int T_string = 1 << 25; + public static final int T_function = 1 << 24; + public static final int T_symbol = 1 << 23; + public static final int T_null = 1 << 22; + public static final int T_list = 1 << 14; + public static final int T_map = 1 << 13; + public static final int T_instance = 1 << 12; + public static final int T_scalar = 1 << 7; + public static final int T_node = 1 << 6; + + private static final String[] TYPENAME = { + S.ANY, S.NIL, S.BOOLEAN, S.DECIMAL, S.INTEGER, S.NUMBER, S.STRING, + S.FUNCTION, S.SYMBOL, S.NULL, + "", "", "", "", "", "", "", + S.LIST, S.MAP, S.INSTANCE, + "", "", "", "", + S.SCALAR, S.NODE, + }; @FunctionalInterface public interface WalkApply { @@ -101,7 +140,62 @@ public static boolean isEmpty(Object val) { } public static boolean isFunc(Object val) { - return val instanceof Runnable; + return val instanceof Runnable || val instanceof Function; + } + + // Get type name string for the highest set bit (the narrowest type), as in the JS typename. + public static String typename(int t) { + String tname = ""; + for (int tI = 0; tI < TYPENAME.length; tI++) { + if (!TYPENAME[tI].isEmpty() && 0 < (t & (1 << (31 - tI)))) { + return TYPENAME[tI]; + } + } + return tname; + } + + // Determine the type of a value as a bitfield integer.
+ public static int typify(Object value) { + if (value == null) { + return T_noval; + } + + if (value instanceof Boolean) { + return T_scalar | T_boolean; + } + + if (value instanceof Integer || value instanceof Long) { + return T_scalar | T_number | T_integer; + } + + if (value instanceof Float || value instanceof Double) { + double d = ((Number) value).doubleValue(); + if (Double.isNaN(d)) { + return T_noval; + } + if (d == Math.floor(d) && !Double.isInfinite(d)) { + return T_scalar | T_number | T_integer; + } + return T_scalar | T_number | T_decimal; + } + + if (value instanceof String) { + return T_scalar | T_string; + } + + if (value instanceof Runnable || value instanceof Function) { + return T_scalar | T_function; + } + + if (value instanceof List) { + return T_node | T_list; + } + + if (value instanceof Map) { + return T_node | T_map; + } + + return T_any; } @SuppressWarnings("unchecked") diff --git a/js/NOTES.md b/js/NOTES.md new file mode 100644 index 00000000..02d796d6 --- /dev/null +++ b/js/NOTES.md @@ -0,0 +1,17 @@ +# JavaScript Implementation Notes + +## undefined vs null + +JavaScript natively distinguishes `undefined` from `null`. In this library: +- `undefined` means **property absence** (the key does not exist, or no value was provided). +- `null` represents **JSON null** (an explicit null value in the data). + +TypeScript tests relating to `undefined` test property absence behavior. Since JavaScript +shares this semantics, no special handling is needed — the language natively supports this distinction. + +## Type System + +This implementation uses bitfield integers for the type system, matching the TypeScript canonical. +Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are exported and `typify()` returns +integer bitfields. Use `typename()` to get the human-readable name for error messages. +Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`). 
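As a rough illustration of the composite checks mentioned above, the sketch below redeclares a few of the bit values listed in this change and tests them with two small helpers. `hasAll` and `hasAny` are hypothetical names used only for this example; they are not part of the library, and the sample type values simply follow what `typify()` is described as returning.

```javascript
// A subset of the type bits, with the same positions as the constants in this change.
const T_integer = 1 << 27
const T_number = 1 << 26
const T_string = 1 << 25
const T_scalar = 1 << 7

// Hypothetical helper: true only if `t` carries every bit in `mask`.
function hasAll(t, mask) {
  return mask === (t & mask)
}

// Hypothetical helper: true if `t` shares at least one bit with `mask`.
function hasAny(t, mask) {
  return 0 !== (t & mask)
}

// Composite values of the kind typify() is described as returning.
const intType = T_scalar | T_number | T_integer
const strType = T_scalar | T_string

const checks = {
  intIsNumber: hasAny(intType, T_number),
  intIsString: hasAny(intType, T_string),
  strIsScalarString: hasAll(strType, T_scalar | T_string),
  intIsScalarString: hasAll(intType, T_scalar | T_string),
}

console.log(checks)
```

The distinction between the two helpers matters for composite masks: an integer shares the `T_scalar` bit with `T_scalar | T_string`, so an "any bit" test would match it, while an "all bits" test would not.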
diff --git a/js/REVIEW.md b/js/REVIEW.md new file mode 100644 index 00000000..8097ccb9 --- /dev/null +++ b/js/REVIEW.md @@ -0,0 +1,181 @@ +# JavaScript (js) - Review vs TypeScript Canonical + +## Overview + +The JavaScript version is significantly behind the TypeScript canonical version. It exports **27 functions** compared to TypeScript's **40+**, and uses an older API design pattern (separate positional parameters instead of the unified `injdef` object pattern). + +--- + +## Missing Functions + +The following functions present in the TypeScript canonical are **completely absent** from the JS version: + +| Function | Category | Impact | +|----------|----------|--------| +| `delprop` | Property access | No way to delete properties cleanly | +| `getelem` | Property access | No negative-index list element access | +| `getdef` | Property access | No defined-or-default helper | +| `setpath` | Path operations | Cannot set values at nested paths | +| `select` | Query operations | No MongoDB-style query/filter on children | +| `size` | Collection | No unified size/length function | +| `slice` | Collection | No array/string slicing with negative indices | +| `flatten` | Collection | No nested array flattening | +| `filter` | Collection | No predicate-based filtering | +| `pad` | String | No string padding utility | +| `replace` | String | No unified string replace | +| `join` | String | No general join (only `joinurl`) | +| `jsonify` | Serialization | No JSON serialization with formatting | +| `typename` | Type system | No type-name-from-bitfield function | +| `jm` | JSON builders | No map builder | +| `jt` | JSON builders | No array/tuple builder | +| `checkPlacement` | Advanced | No placement validation for injectors | +| `injectorArgs` | Advanced | No injector argument validation | +| `injectChild` | Advanced | No child injection helper | + +--- + +## API Signature Differences + +### 1. 
`typify` returns strings instead of bitfield integers + +- **TS**: Returns a numeric bitfield (e.g., `T_string`, `T_integer | T_number`). Enables bitwise type composition and checking. +- **JS**: Returns a simple string (`'null'`, `'string'`, `'number'`, `'boolean'`, `'function'`, `'array'`, `'object'`). +- **Impact**: The entire bitfield-based type system (`T_any`, `T_noval`, `T_boolean`, `T_decimal`, `T_integer`, `T_number`, `T_string`, `T_function`, `T_symbol`, `T_null`, `T_list`, `T_map`, `T_instance`, `T_scalar`, `T_node`) is missing. This prevents fine-grained type discrimination (e.g., distinguishing `integer` from `decimal`). + +### 2. `walk` has a different signature + +- **TS**: `walk(val, before?, after?, maxdepth?, key?, parent?, path?)` - supports separate `before` and `after` callbacks, and a `maxdepth` limit. +- **JS**: `walk(val, apply, key?, parent?, path?)` - single `apply` callback (post-order only), no `maxdepth`. +- **Impact**: Cannot apply transformations before descending into children; no depth protection against deeply nested structures. + +### 3. `inject` uses positional parameters instead of `injdef` + +- **TS**: `inject(val, store, injdef?)` where `injdef` is a `Partial` with `modify`, `handler`, `extra`, `meta`, `errs` fields. +- **JS**: `inject(val, store, modify?, current?, state?)` - separate positional parameters. +- **Impact**: Less extensible; adding new options requires changing the function signature. + +### 4. `transform` uses positional parameters instead of `injdef` + +- **TS**: `transform(data, spec, injdef?)` - unified injection definition. +- **JS**: `transform(data, spec, extra?, modify?)` - separate params. +- **Impact**: Same extensibility concern as `inject`. + +### 5. `validate` uses positional parameters instead of `injdef` + +- **TS**: `validate(data, spec, injdef?)` - unified injection definition. +- **JS**: `validate(data, spec, extra?, collecterrs?)` - separate params. + +### 6. 
`getpath` parameter order differs + +- **TS**: `getpath(store, path, injdef?)` - store first. +- **JS**: `getpath(path, store, current?, state?)` - path first. +- **Impact**: Inconsistent with the rest of the TS API. + +### 7. `joinurl` is a standalone function + +- **TS**: Uses `join(arr, sep?, url?)` with a `url` parameter for URL mode. +- **JS**: Has a separate `joinurl(sarr)` function; no general `join`. +- **Impact**: Less unified API. + +### 8. `setprop` deletion behavior differs + +- **TS**: Has separate `delprop` function; `setprop` with `DELETE` marker deletes. +- **JS**: `setprop` with `undefined` value deletes the property. +- **Impact**: Conflates "set to undefined" with "delete". + +--- + +## Validation Differences + +- **TS**: Uses `$MAP`, `$LIST`, `$INTEGER`, `$DECIMAL`, `$NIL`, `$INSTANCE` validators. +- **JS**: Uses `$OBJECT`, `$ARRAY` (no `$MAP`/`$LIST` aliases). Missing `$INTEGER`, `$DECIMAL`, `$NIL`, `$INSTANCE` validators. +- **Impact**: Less granular validation; cannot distinguish integer from decimal numbers. + +--- + +## Transform Differences + +- **TS**: Supports `$ANNO`, `$FORMAT`, `$APPLY`, `$REF`, `$BT`, `$DS`, `$WHEN` transform commands. +- **JS**: Missing `$ANNO`, `$FORMAT`, `$APPLY`, `$REF` (some may be partially present). Has `$BT`, `$DS`, `$WHEN`. +- **Impact**: Fewer transformation capabilities. + +--- + +## Structural/Architectural Differences + +### No Injection Class +- **TS**: Has a full `Injection` class with methods (`descend()`, `child()`, `setval()`, `toString()`). +- **JS**: Uses plain objects for state management. +- **Impact**: Less structured state management; harder to debug injection processing. + +### No Type Constants +- **TS**: Exports `T_any`, `T_noval`, `T_boolean`, `T_decimal`, `T_integer`, etc. as bitfield constants. +- **JS**: No type constants at all. + +### No SKIP/DELETE Sentinels +- **TS**: Exports `SKIP` and `DELETE` sentinel objects. +- **JS**: Not exported (may be used internally). 
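To make the difference between "set to undefined" and "delete" concrete, here is a minimal sketch of a sentinel-based setter for maps. It assumes the sentinel is compared by identity; `setpropSketch` is a hypothetical name chosen to avoid confusion with the library's `setprop`, and the real implementation may behave differently (for example, for lists).

```javascript
// Hypothetical sentinel, shaped like the DELETE marker object in js/src/struct.js.
const DELETE = { '`$DELETE`': true }

// Hypothetical map-only setter: removes the key only when passed the DELETE sentinel itself.
function setpropSketch(parent, key, val) {
  if (DELETE === val) {
    delete parent[key]
  }
  else {
    // Any other value, including undefined, is stored as-is.
    parent[key] = val
  }
  return parent
}

const deleted = setpropSketch({ a: 1, b: 2 }, 'a', DELETE)
const kept = setpropSketch({ a: 1, b: 2 }, 'a', undefined)

console.log(Object.keys(deleted), Object.keys(kept))
```

With a dedicated marker, a caller can store `undefined` under a key without the key disappearing, which is the conflation the `setprop` item above describes.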
+ +### No `merge` maxdepth parameter +- **TS**: `merge(val, maxdepth?)` supports depth limiting. +- **JS**: `merge(val)` has no depth limit. + +--- + +## Significant Language Difference Issues + +1. **No issues** - JavaScript and TypeScript share the same runtime semantics, so there are no fundamental language barriers. All differences are implementation gaps. + +--- + +## Test Coverage Gaps + +Tests missing for: `setpath`, `select`, `size`, `slice`, `flatten`, `filter`, `pad`, `jsonify`, `delprop`, `getelem`, `typename`, `jm`, `jt`, `walk-depth`, `walk-copy`, `merge-depth`, `getpath-special`, `getpath-handler`, `transform-ref`, `transform-format`, `transform-apply`, `validate-special`, `validate-edge`, `select-*`. + +--- + +## Alignment Plan + +### Phase 1: Core Missing Functions (High Priority) +1. Add `delprop(parent, key)` function +2. Add `getelem(val, key, alt)` with negative index support +3. Add `getdef(val, alt)` helper +4. Add `setpath(store, path, val, injdef)` function +5. Add `size(val)` function +6. Add `select(children, query)` with operator support + +### Phase 2: Type System Alignment +7. Convert `typify` to return bitfield integers matching TS constants +8. Add all type constants (`T_any`, `T_noval`, `T_boolean`, etc.) +9. Add `typename(t)` function +10. Export `SKIP` and `DELETE` sentinels + +### Phase 3: Collection Functions +11. Add `slice(val, start, end, mutate)` function +12. Add `flatten(list, depth)` function +13. Add `filter(val, check)` function +14. Add `pad(str, padding, padchar)` function +15. Add `replace(s, from, to)` function +16. Add `join(arr, sep, url)` (general join, deprecate standalone `joinurl`) +17. Add `jsonify(val, flags)` function +18. Add `jm(...kv)` and `jt(...v)` JSON builders + +### Phase 4: API Signature Alignment +19. Refactor `walk` to support `before`/`after` callbacks and `maxdepth` +20. Refactor `inject` to use `injdef` object parameter +21. Refactor `transform` to use `injdef` object parameter +22. 
Refactor `validate` to use `injdef` object parameter +23. Align `getpath` parameter order to `(store, path, injdef)` +24. Add `merge` `maxdepth` parameter + +### Phase 5: Injection System +25. Create `Injection` class with `descend()`, `child()`, `setval()` methods +26. Add `checkPlacement`, `injectorArgs`, `injectChild` functions + +### Phase 6: Validation/Transform Parity +27. Add `$MAP`, `$LIST`, `$INTEGER`, `$DECIMAL`, `$NIL`, `$INSTANCE` validators +28. Add `$ANNO`, `$FORMAT`, `$APPLY`, `$REF` transform commands + +### Phase 7: Test Alignment +29. Add tests for all new functions using shared `test.json` spec +30. Ensure all test categories from TS are passing diff --git a/js/src/struct.js b/js/src/struct.js index 769b09fa..19a70c0e 100644 --- a/js/src/struct.js +++ b/js/src/struct.js @@ -1,5 +1,5 @@ -/* Copyright (c) 2025 Voxgig Ltd. MIT LICENSE. */ - +/* Copyright (c) 2025-2026 Voxgig Ltd. MIT LICENSE. */ +// VERSION: @voxgig/struct 0.0.10 /* Voxgig Struct * ============= * @@ -31,13 +31,15 @@ * - stringify: human-friendly string version of a value. * - escre: escape a regular expresion string. * - escurl: escape a url. - * - joinurl: join parts of a url, merging forward slashes. + * - join: join parts of a url, merging forward slashes. * * This set of functions and supporting utilities is designed to work * uniformly across many languages, meaning that some code that may be * functionally redundant in specific languages is still retained to * keep the code human comparable. * + * NOTE: Lists are assumed to be mutable and reference stable. + * * NOTE: In this code JSON nulls are in general *not* considered the * same as the undefined value in the given language. However most * JSON parsers do use the undefined value to represent JSON @@ -48,1353 +50,1675 @@ * the unit tests use the string "__NULL__" where necessary. * */ - - // String constants are explicitly defined. - -// Mode value for inject step. 
-const S_MKEYPRE = 'key:pre' -const S_MKEYPOST = 'key:post' -const S_MVAL = 'val' -const S_MKEY = 'key' - -// Special keys. -const S_DKEY = '`$KEY`' -const S_DMETA = '`$META`' +// Mode value for inject step (bitfield). +const M_KEYPRE = 1 +const M_KEYPOST = 2 +const M_VAL = 4 +// Special strings. +const S_BKEY = '`$KEY`' +const S_BANNO = '`$ANNO`' +const S_BEXACT = '`$EXACT`' +const S_BVAL = '`$VAL`' +const S_DKEY = '$KEY' const S_DTOP = '$TOP' const S_DERRS = '$ERRS' - +const S_DSPEC = '$SPEC' // General strings. -const S_array = 'array' +const S_list = 'list' const S_base = 'base' const S_boolean = 'boolean' const S_function = 'function' +const S_symbol = 'symbol' +const S_instance = 'instance' +const S_key = 'key' +const S_any = 'any' +const S_nil = 'nil' +const S_null = 'null' const S_number = 'number' const S_object = 'object' const S_string = 'string' -const S_null = 'null' -const S_MT = '' +const S_decimal = 'decimal' +const S_integer = 'integer' +const S_map = 'map' +const S_scalar = 'scalar' +const S_node = 'node' +// Character strings. const S_BT = '`' +const S_CN = ':' +const S_CS = ']' const S_DS = '$' const S_DT = '.' -const S_CN = ':' +const S_FS = '/' const S_KEY = 'KEY' - - +const S_MT = '' +const S_OS = '[' +const S_SP = ' ' +const S_CM = ',' +const S_VIZ = ': ' +// Types +let t = 31 +const T_any = (1 << t--) - 1 +const T_noval = 1 << t--; // Means property absent, undefined. Also NOT a scalar! +const T_boolean = 1 << t-- +const T_decimal = 1 << t-- +const T_integer = 1 << t-- +const T_number = 1 << t-- +const T_string = 1 << t-- +const T_function = 1 << t-- +const T_symbol = 1 << t-- +const T_null = 1 << t--; // The actual JSON null value. 
+t -= 7 +const T_list = 1 << t-- +const T_map = 1 << t-- +const T_instance = 1 << t-- +t -= 4 +const T_scalar = 1 << t-- +const T_node = 1 << t-- +const TYPENAME = [ + S_any, + S_nil, + S_boolean, + S_decimal, + S_integer, + S_number, + S_string, + S_function, + S_symbol, + S_null, + '', '', '', + '', '', '', '', + S_list, + S_map, + S_instance, + '', '', '', '', + S_scalar, + S_node, +] // The standard undefined value for this language. -const UNDEF = undefined - - +const NONE = undefined +// Private markers +const SKIP = { '`$SKIP`': true } +const DELETE = { '`$DELETE`': true } +// Regular expression constants +const R_INTEGER_KEY = /^[-0-9]+$/; // Match integer keys (including <0). +const R_ESCAPE_REGEXP = /[.*+?^${}()|[\]\\]/g; // Chars that need escaping in regexp. +const R_TRAILING_SLASH = /\/+$/; // Trailing slashes in URLs. +const R_LEADING_TRAILING_SLASH = /([^\/])\/+/; // Multiple slashes in URL middle. +const R_LEADING_SLASH = /^\/+/; // Leading slashes in URLs. +const R_QUOTES = /"/g; // Double quotes for removal. +const R_DOT = /\./g; // Dots in path strings. +const R_CLONE_REF = /^`\$REF:([0-9]+)`$/; // Copy reference in cloning. +const R_META_PATH = /^([^$]+)\$([=~])(.+)$/; // Meta path syntax. +const R_DOUBLE_DOLLAR = /\$\$/g; // Double dollar escape sequence. +const R_TRANSFORM_NAME = /`\$([A-Z]+)`/g; // Transform command names. +const R_INJECTION_FULL = /^`(\$[A-Z]+|[^`]*)[0-9]*`$/; // Full string injection pattern. +const R_BT_ESCAPE = /\$BT/g; // Backtick escape sequence. +const R_DS_ESCAPE = /\$DS/g; // Dollar sign escape sequence. +const R_INJECTION_PARTIAL = /`([^`]+)`/g; // Partial string injection pattern. +// Default max depth (for walk etc). +const MAXDEPTH = 32 +// Return type string for narrowest type. +function typename(t) { + return getelem(TYPENAME, Math.clz32(t), TYPENAME[0]) +} +// Get a defined value. Returns alt if val is undefined. 
+function getdef(val, alt) { + if (NONE === val) { + return alt + } + return val +} // Value is a node - defined, and a map (hash) or list (array). -// NOTE: javascript -// stuff +// NOTE: typescript +// things function isnode(val) { - return null != val && S_object == typeof val + return null != val && S_object == typeof val } - - // Value is a defined map (hash) with string keys. function ismap(val) { - return null != val && S_object == typeof val && !Array.isArray(val) + return null != val && S_object == typeof val && !Array.isArray(val) } - - // Value is a defined list (array) with integer keys (indexes). function islist(val) { - return Array.isArray(val) + return Array.isArray(val) } - - // Value is a defined string (non-empty) or integer key. function iskey(key) { - const keytype = typeof key - return (S_string === keytype && S_MT !== key) || S_number === keytype + const keytype = typeof key + return (S_string === keytype && S_MT !== key) || S_number === keytype } - - // Check for an "empty" value - undefined, empty string, array, object. function isempty(val) { - return null == val || S_MT === val || - (Array.isArray(val) && 0 === val.length) || - (S_object === typeof val && 0 === Object.keys(val).length) + return null == val || S_MT === val || + (Array.isArray(val) && 0 === val.length) || + (S_object === typeof val && 0 === Object.keys(val).length) } - - // Value is a function. function isfunc(val) { - return S_function === typeof val + return S_function === typeof val } - - -// Determine the type of a value as a string. -// Returns one of: 'null', 'string', 'number', 'boolean', 'function', 'array', 'object' -// Normalizes and simplifies JavaScript's type system for consistency. +// The integer size of the value. For lists and strings, the length; for maps, the number of keys; +// for numbers, the integer part; for booleans, true is 1 and false is 0; for all other values, 0.
+function size(val) { + if (islist(val)) { + return val.length + } + else if (ismap(val)) { + return Object.keys(val).length + } + const valtype = typeof val + if (S_string == valtype) { + return val.length + } + else if (S_number == typeof val) { + return Math.floor(val) + } + else if (S_boolean == typeof val) { + return true === val ? 1 : 0 + } + else { + return 0 + } +} +// Extract part of an array or string into a new value, from the start +// point to the end point. If no end is specified, extract to the +// full length of the value. Negative arguments count from the end of +// the value. For numbers, perform min and max bounding, where start +// is inclusive, and end is *exclusive*. +// NOTE: input lists are not mutated by default. Use the mutate +// argument to mutate lists in place. +function slice(val, start, end, mutate) { + if (S_number === typeof val) { + start = null == start || S_number !== typeof start ? Number.MIN_SAFE_INTEGER : start + end = (null == end || S_number !== typeof end ? Number.MAX_SAFE_INTEGER : end) - 1 + return Math.min(Math.max(val, start), end) + } + const vlen = size(val) + if (null != end && null == start) { + start = 0 + } + if (null != start) { + if (start < 0) { + end = vlen + start + if (end < 0) { + end = 0 + } + start = 0 + } + else if (null != end) { + if (end < 0) { + end = vlen + end + if (end < 0) { + end = 0 + } + } + else if (vlen < end) { + end = vlen + } + } + else { + end = vlen + } + if (vlen < start) { + start = vlen + } + if (-1 < start && start <= end && end <= vlen) { + if (islist(val)) { + if (mutate) { + for (let i = 0, j = start; j < end; i++, j++) { + val[i] = val[j] + } + val.length = (end - start) + } + else { + val = val.slice(start, end) + } + } + else if (S_string === typeof val) { + val = val.substring(start, end) + } + } + else { + if (islist(val)) { + val = [] + } + else if (S_string === typeof val) { + val = S_MT + } + } + } + return val +} +// String padding. 
+function pad(str, padding, padchar) { + str = S_string === typeof str ? str : stringify(str) + padding = null == padding ? 44 : padding + padchar = null == padchar ? S_SP : ((padchar + S_SP)[0]) + return -1 < padding ? str.padEnd(padding, padchar) : str.padStart(0 - padding, padchar) +} +// Determine the type of a value as a bit code. function typify(value) { - if (value === null || value === undefined) { - return S_null - } - - const type = typeof value - - if (Array.isArray(value)) { - return S_array - } - - if (type === 'object') { - return S_object - } - - return type + if (undefined === value) { + return T_noval + } + const typestr = typeof value + if (null === value) { + return T_scalar | T_null + } + else if (S_number === typestr) { + if (Number.isInteger(value)) { + return T_scalar | T_number | T_integer + } + else if (isNaN(value)) { + return T_noval + } + else { + return T_scalar | T_number | T_decimal + } + } + else if (S_string === typestr) { + return T_scalar | T_string + } + else if (S_boolean === typestr) { + return T_scalar | T_boolean + } + else if (S_function === typestr) { + return T_scalar | T_function + } + // For languages that have symbolic atoms. + else if (S_symbol === typestr) { + return T_scalar | T_symbol + } + else if (Array.isArray(value)) { + return T_node | T_list + } + else if (S_object === typestr) { + if (value.constructor instanceof Function) { + let cname = value.constructor.name + if ('Object' !== cname && 'Array' !== cname) { + return T_node | T_instance + } + } + return T_node | T_map + } + // Anything else (e.g. bigint) is considered T_any + return T_any +} +// Get a list element. The key should be an integer, or a string +// that can parse to an integer only. Negative integers count from the end of the list. 
+function getelem(val, key, alt) { + let out = NONE + if (NONE === val || NONE === key) { + return alt + } + if (islist(val)) { + let nkey = parseInt(key) + if (Number.isInteger(nkey) && ('' + key).match(R_INTEGER_KEY)) { + if (nkey < 0) { + key = val.length + nkey + } + out = val[key] + } + } + if (NONE === out) { + return 0 < (T_function & typify(alt)) ? alt() : alt + } + return out } - - // Safely get a property of a node. Undefined arguments return undefined. // If the key is not found, return the alternative value, if any. function getprop(val, key, alt) { - let out = alt - - if (UNDEF === val || UNDEF === key) { - return alt - } - - if (isnode(val)) { - out = val[key] - } - - if (UNDEF === out) { - return alt - } - - return out + let out = alt + if (NONE === val || NONE === key) { + return alt + } + if (isnode(val)) { + out = val[key] + } + if (NONE === out) { + return alt + } + return out } - - // Convert different types of keys to string representation. // String keys are returned as is. // Number keys are converted to strings. // Floats are truncated to integers. // Booleans, objects, arrays, null, undefined all return empty string. -function strkey(key = UNDEF) { - if (UNDEF === key) { - return S_MT - } - - if (typeof key === S_string) { - return key - } - - if (typeof key === S_boolean) { +function strkey(key = NONE) { + if (NONE === key) { + return S_MT + } + const t = typify(key) + if (0 < (T_string & t)) { + return key + } + else if (0 < (T_boolean & t)) { + return S_MT + } + else if (0 < (T_number & t)) { + return key % 1 === 0 ? String(key) : String(Math.floor(key)) + } return S_MT - } - - if (typeof key === S_number) { - return key % 1 === 0 ? String(key) : String(Math.floor(key)) - } - - return S_MT } - - -// Sorted keys of a map, or indexes of a list. +// Sorted keys of a map, or indexes (as strings) of a list. +// Root utility - only uses language facilities. function keysof(val) { - return !isnode(val) ? [] : - ismap(val) ? 
Object.keys(val).sort() : val.map((_n, i) => '' + i) + return !isnode(val) ? [] : + ismap(val) ? Object.keys(val).sort() : val.map((_n, i) => S_MT + i) } - - // Value of property with name key in node val is defined. +// Root utility - only uses language facilities. function haskey(val, key) { - return UNDEF !== getprop(val, key) + return NONE !== getprop(val, key) } - - -// List the sorted keys of a map or list as an array of tuples of the form [key, value]. -// NOTE: Unlike keysof, list indexes are returned as numbers. -function items(val) { - return keysof(val).map(k => [k, val[k]]) +function items(val, apply) { + let out = keysof(val).map((k) => [k, val[k]]) + if (null != apply) { + out = out.map(apply) + } + return out +} +// To replicate the array spread operator: +// a=1, b=[2,3], c=[4,5] +// [a,...b,c] -> [1,2,3,[4,5]] +// flatten([a,b,[c]]) -> [1,2,3,[4,5]] +// NOTE: [c] ensures c is not expanded +function flatten(list, depth) { + if (!islist(list)) { + return list + } + return list.flat(getdef(depth, 1)) +} +// Filter item values using check function. +function filter(val, check) { + let all = items(val) + let numall = size(all) + let out = [] + for (let i = 0; i < numall; i++) { + if (check(all[i])) { + out.push(all[i][1]) + } + } + return out } - - // Escape regular expression. function escre(s) { - s = null == s ? S_MT : s - return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&') + // s = null == s ? S_MT : s + return replace(s, R_ESCAPE_REGEXP, '\\$&') } - - // Escape URLs. function escurl(s) { - s = null == s ? S_MT : s - return encodeURIComponent(s) + s = null == s ? S_MT : s + return encodeURIComponent(s) } - - -// Concatenate url part strings, merging forward slashes as needed. -function joinurl(sarr) { - return sarr - .filter(s => null != s && '' !== s) - .map((s, i) => 0 === i ? 
s.replace(/\/+$/, '') : - s.replace(/([^\/])\/+/, '$1/').replace(/^\/+/, '').replace(/\/+$/, '')) - .filter(s => '' !== s) - .join('/') +// Replace a search string (all), or a regexp, in a source string. +function replace(s, from, to) { + let rs = s + let ts = typify(s) + if (0 === (T_string & ts)) { + rs = stringify(s) + } + else if (0 < ((T_noval | T_null) & ts)) { + rs = S_MT + } + else { + rs = stringify(s) + } + return rs.replace(from, to) } - - -// Safely stringify a value for humans (NOT JSON!). -function stringify(val, maxlen) { - let str = S_MT - - if (UNDEF === val) { +// Concatenate url part strings, merging sep char as needed. +function join(arr, sep, url) { + const sarr = size(arr) + const sepdef = getdef(sep, S_CM) + const sepre = 1 === size(sepdef) ? escre(sepdef) : NONE + const out = filter(items( + // filter(arr, (n) => null != n[1] && S_MT !== n[1]), + filter(arr, (n) => (0 < (T_string & typify(n[1]))) && S_MT !== n[1]), (n) => { + let i = +n[0] + let s = n[1] + if (NONE !== sepre && S_MT !== sepre) { + if (url && 0 === i) { + s = replace(s, RegExp(sepre + '+$'), S_MT) + return s + } + if (0 < i) { + s = replace(s, RegExp('^' + sepre + '+'), S_MT) + } + if (i < sarr - 1 || !url) { + s = replace(s, RegExp(sepre + '+$'), S_MT) + } + s = replace(s, RegExp('([^' + sepre + '])' + sepre + '+([^' + sepre + '])'), '$1' + sepdef + '$2') + } + return s + }), (n) => S_MT !== n[1]) + .join(sepdef) + return out +} +// Output JSON in a "standard" format, with 2 space indents, each property on a new line, +// and spaces after {[: and before ]}. Any "weird" values (NaN, etc) are output as null. +// In general, the behavior of JavaScript's JSON.stringify(val,null,2) is followed.
+function jsonify(val, flags) { + let str = S_null + if (null != val) { + try { + const indent = getprop(flags, 'indent', 2) + str = JSON.stringify(val, null, indent) + if (NONE === str) { + str = S_null + } + const offset = getprop(flags, 'offset', 0) + if (0 < offset) { + // Left offset entire indented JSON so that it aligns with surrounding code + // indented by offset. Assume first brace is on line with assignment, so not offset. + str = '{\n' + + join(items(slice(str.split('\n'), 1), (n) => pad(n[1], 0 - offset - size(n[1]))), '\n') + } + } + catch (e) { + str = '__JSONIFY_FAILED__' + } + } return str - } - - try { - str = JSON.stringify(val, function(_key, val) { - if ( - val !== null && - typeof val === "object" && - !Array.isArray(val) - ) { - const sortedObj = {} - for (const k of Object.keys(val).sort()) { - sortedObj[k] = val[k] - } - return sortedObj - } - return val - }) - } - catch (err) { - str = S_MT + val - } - - str = S_string !== typeof str ? S_MT + str : str - str = str.replace(/"/g, '') - - if (null != maxlen) { - let js = str.substring(0, maxlen) - str = maxlen < str.length ? (js.substring(0, maxlen - 3) + '...') : str - } - - return str } - - +// Safely stringify a value for humans (NOT JSON!). +function stringify(val, maxlen, pretty) { + let valstr = S_MT + pretty = !!pretty + if (NONE === val) { + return pretty ? '<>' : valstr + } + if (S_string === typeof val) { + valstr = val + } + else { + try { + valstr = JSON.stringify(val, function (_key, val) { + if (val !== null && + typeof val === "object" && + !Array.isArray(val)) { + const sortedObj = {} + items(val, (n) => { + sortedObj[n[0]] = val[n[0]] + }) + return sortedObj + } + return val + }) + valstr = valstr.replace(R_QUOTES, S_MT) + } + catch (err) { + valstr = '__STRINGIFY_FAILED__' + } + } + if (null != maxlen && -1 < maxlen) { + let js = valstr.substring(0, maxlen) + valstr = maxlen < valstr.length ?
(js.substring(0, maxlen - 3) + '...') : valstr + } + if (pretty) { + // Indicate deeper JSON levels with different terminal colors (simplistic wrt strings). + let c = items([81, 118, 213, 39, 208, 201, 45, 190, 129, 51, 160, 121, 226, 33, 207, 69], (n) => '\x1b[38;5;' + n[1] + 'm'), r = '\x1b[0m', d = 0, o = c[0], t = o + for (const ch of valstr) { + if (ch === '{' || ch === '[') { + d++ + o = c[d % c.length] + t += o + ch + } + else if (ch === '}' || ch === ']') { + t += o + ch + d-- + o = c[d % c.length] + } + else { + t += o + ch + } + } + return t + r + } + return valstr +} // Build a human friendly path string. function pathify(val, startin, endin) { - let pathstr = UNDEF - - let path = islist(val) ? val : - S_string == typeof val ? [val] : - S_number == typeof val ? [val] : - UNDEF - - const start = null == startin ? 0 : -1 < startin ? startin : 0 - const end = null == endin ? 0 : -1 < endin ? endin : 0 - - if (UNDEF != path && 0 <= start) { - path = path.slice(start, path.length - end) - if (0 === path.length) { - pathstr = '' + let pathstr = NONE + let path = islist(val) ? val : + S_string == typeof val ? [val] : + S_number == typeof val ? [val] : + NONE + const start = null == startin ? 0 : -1 < startin ? startin : 0 + const end = null == endin ? 0 : -1 < endin ? endin : 0 + if (NONE != path && 0 <= start) { + path = slice(path, start, path.length - end) + if (0 === path.length) { + pathstr = '' + } + else { + pathstr = join(items(filter(path, (n) => iskey(n[1])), (n) => { + let p = n[1] + return S_number === typeof p ? S_MT + Math.floor(p) : + p.replace(R_DOT, S_MT) + }), S_DT) + } } - else { - pathstr = path - // .filter((p, t) => (t = typeof p, S_string === t || S_number === t)) - .filter((p) => iskey(p)) - .map((p) => - 'number' === typeof p ? 
S_MT + Math.floor(p) : - p.replace(/\./g, S_MT)) - .join(S_DT) - } - } - - if (UNDEF === pathstr) { - pathstr = '' - } - - return pathstr + if (NONE === pathstr) { + pathstr = '' + } + return pathstr } - - // Clone a JSON-like data structure. -// NOTE: function value references are copied, *not* cloned. +// NOTE: function and instance values are copied, *not* cloned. function clone(val) { - const refs = [] - const replacer = (_k, v) => S_function === typeof v ? - (refs.push(v), '`$FUNCTION:' + (refs.length - 1) + '`') : v - const reviver = (_k, v, m) => S_string === typeof v ? - (m = v.match(/^`\$FUNCTION:([0-9]+)`$/), m ? refs[m[1]] : v) : v - return UNDEF === val ? UNDEF : JSON.parse(JSON.stringify(val, replacer), reviver) + const refs = [] + const reftype = T_function | T_instance + const replacer = (_k, v) => 0 < (reftype & typify(v)) ? + (refs.push(v), '`$REF:' + (refs.length - 1) + '`') : v + const reviver = (_k, v, m) => S_string === typeof v ? + (m = v.match(R_CLONE_REF), m ? refs[m[1]] : v) : v + const out = NONE === val ? NONE : JSON.parse(JSON.stringify(val, replacer), reviver) + return out +} +// Define a JSON Object using function arguments. +function jm(...kv) { + const kvsize = size(kv) + const o = {} + for (let i = 0; i < kvsize; i += 2) { + let k = getprop(kv, i, '$KEY' + i) + k = 'string' === typeof k ? k : stringify(k) + o[k] = getprop(kv, i + 1, null) + } + return o +} +// Define a JSON Array using function arguments. +function jt(...v) { + const vsize = size(v) + const a = new Array(vsize) + for (let i = 0; i < vsize; i++) { + a[i] = getprop(v, i, null) + } + return a +} +// Safely delete a property from an object or array element. +// Undefined arguments and invalid keys are ignored. +// Returns the (possibly modified) parent. +// For objects, the property is deleted using the delete operator. +// For arrays, the element at the index is removed and remaining elements are shifted down. 
+// NOTE: parent list may be new list, thus update references. +function delprop(parent, key) { + if (!iskey(key)) { + return parent + } + if (ismap(parent)) { + key = strkey(key) + delete parent[key] + } + else if (islist(parent)) { + // Ensure key is an integer. + let keyI = +key + if (isNaN(keyI)) { + return parent + } + keyI = Math.floor(keyI) + // Delete list element at position keyI, shifting later elements down. + const psize = size(parent) + if (0 <= keyI && keyI < psize) { + for (let pI = keyI; pI < psize - 1; pI++) { + parent[pI] = parent[pI + 1] + } + parent.length = parent.length - 1 + } + } + return parent } - - // Safely set a property. Undefined arguments and invalid keys are ignored. // Returns the (possibly modified) parent. -// If the value is undefined the key will be deleted from the parent. // If the parent is a list, and the key is negative, prepend the value. // NOTE: If the key is above the list size, append the value; below, prepend. -// If the value is undefined, remove the list element at index key, and shift the -// remaining elements down. These rules avoid "holes" in the list. +// NOTE: parent list may be new list, thus update references. function setprop(parent, key, val) { - if (!iskey(key)) { - return parent - } - - if (ismap(parent)) { - key = S_MT + key - if (UNDEF === val) { - delete parent[key] - } - else { - parent[key] = val + if (!iskey(key)) { + return parent } - } - else if (islist(parent)) { - // Ensure key is an integer. - let keyI = +key - - if (isNaN(keyI)) { - return parent + if (ismap(parent)) { + key = S_MT + key + const pany = parent + pany[key] = val } - - keyI = Math.floor(keyI) - - // Delete list element at position keyI, shifting later elements down. - if (UNDEF === val) { - if (0 <= keyI && keyI < parent.length) { - for (let pI = keyI; pI < parent.length - 1; pI++) { - parent[pI] = parent[pI + 1] + else if (islist(parent)) { + // Ensure key is an integer. 
+ let keyI = +key + if (isNaN(keyI)) { + return parent + } + keyI = Math.floor(keyI) + // TODO: DELETE list element + // Set or append value at position keyI, or append if keyI out of bounds. + if (0 <= keyI) { + parent[slice(keyI, 0, size(parent) + 1)] = val + } + // Prepend value if keyI is negative + else { + parent.unshift(val) } - parent.length = parent.length - 1 - } - } - - // Set or append value at position keyI, or append if keyI out of bounds. - else if (0 <= keyI) { - parent[parent.length < keyI ? parent.length : keyI] = val - } - - // Prepend value if keyI is negative - else { - parent.unshift(val) } - } - - return parent + return parent } - - // Walk a data structure depth first, applying a function to each value. function walk( - // These arguments are the public interface. - val, - apply, - - // These areguments are used for recursive state. - key, - parent, - path -) { - if (isnode(val)) { - for (let [ckey, child] of items(val)) { - setprop(val, ckey, walk(child, apply, ckey, val, [...(path || []), S_MT + ckey])) - } - } - - // Nodes are applied *after* their children. - // For the root node, key and parent will be undefined. - return apply(key, val, parent, path || []) +// These arguments are the public interface. +val, +// Before descending into a node. +before, +// After descending into a node. +after, +// Maximum recursive depth, default: 32. Use null for infinite depth. +maxdepth, +// These arguments are used for recursive state. +key, parent, path) { + if (NONE === path) { + path = [] + } + let out = null == before ? val : before(key, val, parent, path) + maxdepth = null != maxdepth && 0 <= maxdepth ? maxdepth : MAXDEPTH + if (0 === maxdepth || (null != path && 0 < maxdepth && maxdepth <= path.length)) { + return out + } + if (isnode(out)) { + for (let [ckey, child] of items(out)) { + setprop(out, ckey, walk(child, before, after, maxdepth, ckey, out, flatten([getdef(path, []), S_MT + ckey]))) + } + } + out = null == after ?
out : after(key, out, parent, path) + return out } - - // Merge a list of values into each other. Later values have // precedence. Nodes override scalars. Node kinds (list or map) // override each other, and do *not* merge. The first element is // modified. -function merge(val) { - let out = UNDEF - - // Handle edge cases. - if (!islist(val)) { - return val - } - - const list = val - const lenlist = list.length - - if (0 === lenlist) { - return UNDEF - } - else if (1 === lenlist) { - return list[0] - } - - // Merge a list of values. - out = getprop(list, 0, {}) - - for (let oI = 1; oI < lenlist; oI++) { - let obj = clone(list[oI]) - - if (!isnode(obj)) { - // Nodes win. - out = obj +function merge(val, maxdepth) { + // const md: number = null == maxdepth ? MAXDEPTH : maxdepth < 0 ? 0 : maxdepth + const md = slice(maxdepth ?? MAXDEPTH, 0) + let out = NONE + // Handle edge cases. + if (!islist(val)) { + return val } - else { - // Nodes win, also over nodes of a different kind. - if (!isnode(out) || (ismap(obj) && islist(out)) || (islist(obj) && ismap(out))) { - out = obj - } - else { - // Node stack. walking down the current obj. - let cur = [out] - let cI = 0 - - function merger( - key, - val, - parent, - path - ) { - if (null == key) { - return val - } - - // Get the curent value at the current path in obj. - // NOTE: this is not exactly efficient, and should be optimised. - let lenpath = path.length - cI = lenpath - 1 - if (UNDEF === cur[cI]) { - cur[cI] = getpath(path.slice(0, lenpath - 1), out) - } - - // Create node if needed. - if (!isnode(cur[cI])) { - cur[cI] = islist(parent) ? [] : {} - } - - // Node child is just ahead of us on the stack, since - // `walk` traverses leaves before nodes. - if (isnode(val) && !isempty(val)) { - setprop(cur[cI], key, cur[cI + 1]) - cur[cI + 1] = UNDEF - } - - // Scalar child. 
- else { - setprop(cur[cI], key, val) - } - - return val + const list = val + const lenlist = list.length + if (0 === lenlist) { + return NONE + } + else if (1 === lenlist) { + return list[0] + } + // Merge a list of values. + out = getprop(list, 0, {}) + for (let oI = 1; oI < lenlist; oI++) { + let obj = list[oI] + if (!isnode(obj)) { + // Nodes win. + out = obj + } + else { + // Current value at path end in overriding node. + let cur = [out] + // Current value at path end in destination node. + let dst = [out] + function before(key, val, _parent, path) { + const pI = size(path) + if (md <= pI) { + setprop(cur[pI - 1], key, val) + } + // Scalars just override directly. + else if (!isnode(val)) { + cur[pI] = val + } + // Descend into override node - Set up correct target in `after` function. + else { + // Descend into destination node using same key. + dst[pI] = 0 < pI ? getprop(dst[pI - 1], key) : dst[pI] + const tval = dst[pI] + // Destination empty, so create node (unless override is class instance). + if (NONE === tval && 0 === (T_instance & typify(val))) { + cur[pI] = islist(val) ? [] : {} + } + // Matching override and destination so continue with their values. + else if (typify(val) === typify(tval)) { + cur[pI] = tval + } + // Override wins. + else { + cur[pI] = val + // No need to descend when override wins (destination is discarded). + val = NONE + } + } + // console.log('BEFORE-END', pathify(path), '@', pI, key, + // stringify(val, -1, 1), stringify(parent, -1, 1), + // 'CUR=', stringify(cur, -1, 1), 'DST=', stringify(dst, -1, 1)) + return val + } + function after(key, _val, _parent, path) { + const cI = size(path) + const target = cur[cI - 1] + const value = cur[cI] + // console.log('AFTER-PREP', pathify(path), '@', cI, cur, '|', + // stringify(key, -1, 1), stringify(value, -1, 1), 'T=', stringify(target, -1, 1)) + setprop(target, key, value) + return value + } + // Walk overriding node, creating paths in output as needed. 
+ out = walk(obj, before, after, maxdepth) + // console.log('WALK-DONE', out, obj) } - - // Walk overriding node, creating paths in output as needed. - walk(obj, merger) - } } - } - - return out + if (0 === md) { + out = getelem(list, -1) + out = islist(out) ? [] : ismap(out) ? {} : out + } + return out } - - -// Get a value deep inside a node using a key path. For example the -// path `a.b` gets the value 1 from {a:{b:1}}. The path can specified -// as a dotted string, or a string array. If the path starts with a -// dot (or the first element is ''), the path is considered local, and -// resolved against the `current` argument, if defined. Integer path -// parts are used as array indexes. The state argument allows for -// custom handling when called from `inject` or `transform`. -function getpath(path, store, current, state) { - - // Operate on a string array. - const parts = islist(path) ? path : S_string === typeof path ? path.split(S_DT) : UNDEF - - if (UNDEF === parts) { - return UNDEF - } - - let root = store - let val = store - const base = getprop(state, S_base) - - // An empty path (incl empty string) just finds the store. - if (null == path || null == store || (1 === parts.length && S_MT === parts[0])) { - // The actual store data may be in a store sub property, defined by state.base. - val = getprop(store, base, store) - } - else if (0 < parts.length) { - let pI = 0 - - // Relative path uses `current` argument. - if (S_MT === parts[0]) { - pI = 1 - root = current +// Set a value using a path. Missing path parts are created. +// String paths create only maps. Use a string list to create list parts. +function setpath(store, path, val, injdef) { + const pathType = typify(path) + const parts = 0 < (T_list & pathType) ? path : + 0 < (T_string & pathType) ? path.split(S_DT) : + 0 < (T_number & pathType) ? [path] : NONE + if (NONE === parts) { + return NONE } - - let part = pI < parts.length ? 
parts[pI] : UNDEF - let first = getprop(root, part) - - // At top level, check state.base, if provided - val = (UNDEF === first && 0 === pI) ? - getprop(getprop(root, base), part) : - first - - // Move along the path, trying to descend into the store. - for (pI++; UNDEF !== val && pI < parts.length; pI++) { - val = getprop(val, parts[pI]) + const base = getprop(injdef, S_base) + const numparts = size(parts) + let parent = getprop(store, base, store) + for (let pI = 0; pI < numparts - 1; pI++) { + const partKey = getelem(parts, pI) + let nextParent = getprop(parent, partKey) + if (!isnode(nextParent)) { + nextParent = 0 < (T_number & typify(getelem(parts, pI + 1))) ? [] : {} + setprop(parent, partKey, nextParent) + } + parent = nextParent } - } - - // State may provide a custom handler to modify found value. - if (null != state && isfunc(state.handler)) { - const ref = pathify(path) - val = state.handler(state, val, current, ref, store) - } - - return val + if (DELETE === val) { + delprop(parent, getelem(parts, -1)) + } + else { + setprop(parent, getelem(parts, -1), val) + } + return parent +} +function getpath(store, path, injdef) { + // Operate on a string array. + const parts = islist(path) ? path : + 'string' === typeof path ? path.split(S_DT) : + 'number' === typeof path ? [strkey(path)] : NONE + if (NONE === parts) { + return NONE + } + // let root = store + let val = store + const base = getprop(injdef, S_base) + const src = getprop(store, base, store) + const numparts = size(parts) + const dparent = getprop(injdef, 'dparent') + // An empty path (incl empty string) just finds the store. 
+ if (null == path || null == store || (1 === numparts && S_MT === parts[0])) { + val = src + } + else if (0 < numparts) { + // Check for $ACTIONs + if (1 === numparts) { + val = getprop(store, parts[0]) + } + if (!isfunc(val)) { + val = src + const m = parts[0].match(R_META_PATH) + if (m && injdef && injdef.meta) { + val = getprop(injdef.meta, m[1]) + parts[0] = m[3] + } + const dpath = getprop(injdef, 'dpath') + for (let pI = 0; NONE !== val && pI < numparts; pI++) { + let part = parts[pI] + if (injdef && S_DKEY === part) { + part = getprop(injdef, S_key) + } + else if (injdef && part.startsWith('$GET:')) { + // $GET:path$ -> get store value, use as path part (string) + part = stringify(getpath(src, slice(part, 5, -1))) + } + else if (injdef && part.startsWith('$REF:')) { + // $REF:refpath$ -> get spec value, use as path part (string) + part = stringify(getpath(getprop(store, S_DSPEC), slice(part, 5, -1))) + } + else if (injdef && part.startsWith('$META:')) { + // $META:metapath$ -> get meta value, use as path part (string) + part = stringify(getpath(getprop(injdef, 'meta'), slice(part, 6, -1))) + } + // $$ escapes $ + part = part.replace(R_DOUBLE_DOLLAR, '$') + if (S_MT === part) { + let ascends = 0 + while (S_MT === parts[1 + pI]) { + ascends++ + pI++ + } + if (injdef && 0 < ascends) { + if (pI === parts.length - 1) { + ascends-- + } + if (0 === ascends) { + val = dparent + } + else { + // const fullpath = slice(dpath, 0 - ascends).concat(parts.slice(pI + 1)) + const fullpath = flatten([slice(dpath, 0 - ascends), parts.slice(pI + 1)]) + if (ascends <= size(dpath)) { + val = getpath(store, fullpath) + } + else { + val = NONE + } + break + } + } + else { + val = dparent + } + } + else { + val = getprop(val, part) + } + } + } + } + // Inj may provide a custom handler to modify found value. 
+ const handler = getprop(injdef, 'handler') + if (null != injdef && isfunc(handler)) { + const ref = pathify(path) + val = handler(injdef, val, ref, store) + } + // console.log('GETPATH', path, val) + return val } - - // Inject values from a data store into a node recursively, resolving -// paths against the store, or current if they are local. THe modify -// argument allows custom modification of the result. The state -// (InjectState) argument is used to maintain recursive state. -function inject( - val, - store, - modify, - current, - state -) { - const valtype = typeof val - - // Create state if at root of injection. The input value is placed - // inside a virtual parent holder to simplify edge cases. - if (UNDEF === state) { - const parent = { [S_DTOP]: val } - - // Set up state assuming we are starting in the virtual parent. - state = { - mode: S_MVAL, - full: false, - keyI: 0, - keys: [S_DTOP], - key: S_DTOP, - val, - parent, - path: [S_DTOP], - nodes: [parent], - handler: _injecthandler, - base: S_DTOP, - modify, - errs: getprop(store, S_DERRS, []), - meta: {}, - } - } - - // Resolve current node in store for local paths. - if (UNDEF === current) { - current = { $TOP: store } - } - else { - const parentkey = getprop(state.path, state.path.length - 2) - current = null == parentkey ? current : getprop(current, parentkey) - } - - // Descend into node. - if (isnode(val)) { - - // Keys are sorted alphanumerically to ensure determinism. - // Injection transforms ($FOO) are processed *after* other keys. - // NOTE: the optional digits suffix of the transform can thus be - // used to order the transforms. - let nodekeys = ismap(val) ? [ - ...Object.keys(val).filter(k => !k.includes(S_DS)).sort(), - ...Object.keys(val).filter(k => k.includes(S_DS)).sort(), - ] : val.map((_n, i) => i) - - // Each child key-value pair is processed in three injection phases: - // 1. state.mode='key:pre' - Key string is injected, returning a possibly altered key. - // 2. 
state.mode='val' - The child value is injected. - // 3. state.mode='key:post' - Key string is injected again, allowing child mutation. - for (let nkI = 0; nkI < nodekeys.length; nkI++) { - const nodekey = S_MT + nodekeys[nkI] - - // let child = parent[nodekey] - let childpath = [...(state.path || []), nodekey] - let childnodes = [...(state.nodes || []), val] - let childval = getprop(val, nodekey) - - const childstate = { - mode: S_MKEYPRE, - full: false, - keyI: nkI, - keys: nodekeys, - key: nodekey, - val: childval, - parent: val, - path: childpath, - nodes: childnodes, - handler: _injecthandler, - base: state.base, - errs: state.errs, - meta: state.meta, - } - - // Peform the key:pre mode injection on the child key. - const prekey = _injectstr(nodekey, store, current, childstate) - - // The injection may modify child processing. - nkI = childstate.keyI - nodekeys = childstate.keys - - // Prevent further processing by returning an undefined prekey - if (UNDEF !== prekey) { - childstate.val = childval = getprop(val, prekey) - childstate.mode = S_MVAL - - // Perform the val mode injection on the child value. - // NOTE: return value is not used. - inject(childval, store, modify, current, childstate) - - // The injection may modify child processing. - nkI = childstate.keyI - nodekeys = childstate.keys - - // Peform the key:post mode injection on the child key. - childstate.mode = S_MKEYPOST - _injectstr(nodekey, store, current, childstate) - - // The injection may modify child processing. - nkI = childstate.keyI - nodekeys = childstate.keys - } +// paths against the store, or current if they are local. The modify +// argument allows custom modification of the result. The inj +// (Injection) argument is used to maintain recursive state. +function inject(val, store, injdef) { + const valtype = typeof val + let inj = injdef + // Create state if at root of injection. The input value is placed + // inside a virtual parent holder to simplify edge cases. 
+ if (NONE === injdef || null == injdef.mode) { + // Set up state assuming we are starting in the virtual parent. + inj = new Injection(val, { [S_DTOP]: val }) + inj.dparent = store + inj.errs = getprop(store, S_DERRS, []) + inj.meta.__d = 0 + if (NONE !== injdef) { + inj.modify = null == injdef.modify ? inj.modify : injdef.modify + inj.extra = null == injdef.extra ? inj.extra : injdef.extra + inj.meta = null == injdef.meta ? inj.meta : injdef.meta + inj.handler = null == injdef.handler ? inj.handler : injdef.handler + } } - } - - // Inject paths into string scalars. - else if (S_string === valtype) { - state.mode = S_MVAL - val = _injectstr(val, store, current, state) - - setprop(state.parent, state.key, val) - } - - // Custom modification. - if (modify) { - let mkey = state.key - let mparent = state.parent - let mval = getprop(mparent, mkey) - modify( - mval, - mkey, - mparent, - state, - current, - store - ) - } - - // Original val reference may no longer be correct. - // This return value is only used as the top level result. - return getprop(state.parent, S_DTOP) + inj.descend() + // console.log('INJ-START', val, inj.mode, inj.key, inj.val, + // 't=', inj.path, 'P=', inj.parent, 'dp=', inj.dparent, 'ST=', store.$TOP) + // Descend into node. + if (isnode(val)) { + // Keys are sorted alphanumerically to ensure determinism. + // Injection transforms ($FOO) are processed *after* other keys. + // NOTE: the optional digits suffix of the transform can thus be + // used to order the transforms. + let nodekeys + nodekeys = keysof(val) + if (ismap(val)) { + nodekeys = flatten([ + filter(nodekeys, (n => !n[1].includes(S_DS))), + filter(nodekeys, (n => n[1].includes(S_DS))), + ]) + } + else { + nodekeys = keysof(val) + } + // Each child key-value pair is processed in three injection phases: + // 1. inj.mode=M_KEYPRE - Key string is injected, returning a possibly altered key. + // 2. inj.mode=M_VAL - The child value is injected. + // 3. 
inj.mode=M_KEYPOST - Key string is injected again, allowing child mutation. + for (let nkI = 0; nkI < nodekeys.length; nkI++) { + const childinj = inj.child(nkI, nodekeys) + const nodekey = childinj.key + childinj.mode = M_KEYPRE + // Perform the key:pre mode injection on the child key. + const prekey = _injectstr(nodekey, store, childinj) + // The injection may modify child processing. + nkI = childinj.keyI + nodekeys = childinj.keys + // Prevent further processing by returning an undefined prekey + if (NONE !== prekey) { + childinj.val = getprop(val, prekey) + childinj.mode = M_VAL + // Perform the val mode injection on the child value. + // NOTE: return value is not used. + inject(childinj.val, store, childinj) + // The injection may modify child processing. + nkI = childinj.keyI + nodekeys = childinj.keys + // Perform the key:post mode injection on the child key. + childinj.mode = M_KEYPOST + _injectstr(nodekey, store, childinj) + // The injection may modify child processing. + nkI = childinj.keyI + nodekeys = childinj.keys + } + } + } + // Inject paths into string scalars. + else if (S_string === valtype) { + inj.mode = M_VAL + val = _injectstr(val, store, inj) + if (SKIP !== val) { + inj.setval(val) + } + } + // Custom modification. + if (inj.modify && SKIP !== val) { + let mkey = inj.key + let mparent = inj.parent + let mval = getprop(mparent, mkey) + inj.modify(mval, mkey, mparent, inj, store) + } + // console.log('INJ-VAL', val) + inj.val = val + // Original val reference may no longer be correct. + // This return value is only used as the top level result. + return getprop(inj.parent, S_DTOP) } - - // The transform_* functions are special command inject handlers (see Injector). - // Delete a key from a map or list. -const transform_DELETE = (state) => { - _setparentprop(state, UNDEF) - return UNDEF +const transform_DELETE = (inj) => { + inj.setval(NONE) + return NONE } - - // Copy value from source data.
-const transform_COPY = (state, _val, current) => { - const { mode, key } = state - - let out = key - if (!mode.startsWith(S_MKEY)) { - out = getprop(current, key) - _setparentprop(state, out) - } - - return out +const transform_COPY = (inj, _val) => { + const ijname = 'COPY' + if (!checkPlacement(M_VAL, ijname, T_any, inj)) { + return NONE + } + let out = getprop(inj.dparent, inj.key) + inj.setval(out) + return out } - - // As a value, inject the key of the parent node. // As a key, defined the name of the key property in the source object. -const transform_KEY = (state, _val, current) => { - const { mode, path, parent } = state - - // Do nothing in val mode. - if (S_MVAL !== mode) { - return UNDEF - } - - // Key is defined by $KEY meta property. - const keyspec = getprop(parent, S_DKEY) - if (UNDEF !== keyspec) { - setprop(parent, S_DKEY, UNDEF) - return getprop(current, keyspec) - } - - // Key is defined within general purpose $META object. - return getprop(getprop(parent, S_DMETA), S_KEY, getprop(path, path.length - 2)) +const transform_KEY = (inj) => { + const { mode, path, parent } = inj + // Do nothing in val mode - not an error. + if (M_VAL !== mode) { + return NONE + } + // Key is defined by $KEY meta property. + const keyspec = getprop(parent, S_BKEY) + if (NONE !== keyspec) { + delprop(parent, S_BKEY) + return getprop(inj.dparent, keyspec) + } + // Key is defined within general purpose $META object. + // return getprop(getprop(parent, S_BANNO), S_KEY, getprop(path, path.length - 2)) + return getprop(getprop(parent, S_BANNO), S_KEY, getelem(path, -2)) } - - -// Store meta data about a node. Does nothing itself, just used by +// Annotate node. Does nothing itself, just used by // other injectors, and is removed when called. 
-const transform_META = (state) => { - const { parent } = state - setprop(parent, S_DMETA, UNDEF) - return UNDEF +const transform_ANNO = (inj) => { + const { parent } = inj + delprop(parent, S_BANNO) + return NONE } - - // Merge a list of objects into the current object. // Must be a key in an object. The value is merged over the current object. // If the value is an array, the elements are first merged using `merge`. // If the value is the empty string, merge the top level store. // Format: { '`$MERGE`': '`source-path`' | ['`source-paths`', ...] } -const transform_MERGE = ( - state, _val, current -) => { - const { mode, key, parent } = state - - if (S_MKEYPRE === mode) { return key } - - // Operate after child values have been transformed. - if (S_MKEYPOST === mode) { - - let args = getprop(parent, key) - args = S_MT === args ? [current.$TOP] : Array.isArray(args) ? args : [args] - - // Remove the $MERGE command from a parent map. - _setparentprop(state, UNDEF) - - // Literals in the parent have precedence, but we still merge onto - // the parent object, so that node tree references are not changed. - const mergelist = [parent, ...args, clone(parent)] - - merge(mergelist) - - return key - } - - // Ensures $MERGE is removed from parent list. - return UNDEF +const transform_MERGE = (inj) => { + const { mode, key, parent } = inj + // Ensures $MERGE is removed from parent list (val mode). + let out = NONE + if (M_KEYPRE === mode) { + out = key + } + // Operate after child values have been transformed. + else if (M_KEYPOST === mode) { + out = key + let args = getprop(parent, key) + args = Array.isArray(args) ? args : [args] + // Remove the $MERGE command from a parent map. + inj.setval(NONE) + // Literals in the parent have precedence, but we still merge onto + // the parent object, so that node tree references are not changed. + const mergelist = flatten([[parent], args, [clone(parent)]]) + merge(mergelist) + } + return out } - - // Convert a node to a list. 
// Format: ['`$EACH`', '`source-path-of-node`', child-template] -const transform_EACH = ( - state, - _val, - current, - _ref, - store -) => { - // Remove arguments to avoid spurious processing. - if (null != state.keys) { - state.keys.length = 1 - } - - if (S_MVAL !== state.mode) { - return UNDEF - } - - // Get arguments: ['`$EACH`', 'source-path', child-template]. - const srcpath = getprop(state.parent, 1) - const child = clone(getprop(state.parent, 2)) - - // Source data. - // const src = getpath(srcpath, store, current, state) - const srcstore = getprop(store, state.base, store) - const src = getpath(srcpath, srcstore, current) - - // Create parallel data structures: - // source entries :: child templates - let tcur = [] - let tval = [] - - const tkey = state.path[state.path.length - 2] - const target = state.nodes[state.path.length - 2] || state.nodes[state.path.length - 1] - - // Create clones of the child template for each value of the current soruce. - if (islist(src)) { - tval = src.map(() => clone(child)) - } - else if (ismap(src)) { - tval = Object.entries(src).map(n => ({ - ...clone(child), - - // Make a note of the key for $KEY transforms. - [S_DMETA]: { KEY: n[0] } - })) - } - - tcur = null == src ? UNDEF : Object.values(src) - - // Parent structure. - tcur = { $TOP: tcur } - - // Build the substructure. - tval = inject(tval, store, state.modify, tcur) - - _updateAncestors(state, target, tkey, tval) - - // Prevent callee from damaging first list entry (since we are in `val` mode). - return tval[0] +const transform_EACH = (inj, _val, _ref, store) => { + const ijname = 'EACH' + if (!checkPlacement(M_VAL, ijname, T_list, inj)) { + return NONE + } + // Remove remaining keys to avoid spurious processing. 
+ slice(inj.keys, 0, 1, true) + // const [err, srcpath, child] = injectorArgs([T_string, T_any], inj) + const [err, srcpath, child] = injectorArgs([T_string, T_any], slice(inj.parent, 1)) + if (NONE !== err) { + inj.errs.push('$' + ijname + ': ' + err) + return NONE + } + // Source data. + const srcstore = getprop(store, inj.base, store) + const src = getpath(srcstore, srcpath, inj) + const srctype = typify(src) + // Create parallel data structures: + // source entries :: child templates + let tcur = [] + let tval = [] + const tkey = getelem(inj.path, -2) + const target = getelem(inj.nodes, -2, () => getelem(inj.nodes, -1)) + // Create clones of the child template for each value of the current source. + if (0 < (T_list & srctype)) { + tval = items(src, () => clone(child)) + } + else if (0 < (T_map & srctype)) { + tval = items(src, (n => merge([ + clone(child), + // Make a note of the key for $KEY transforms. + { [S_BANNO]: { KEY: n[0] } } + ], 1))) + } + let rval = [] + if (0 < size(tval)) { + tcur = null == src ? NONE : Object.values(src) + const ckey = getelem(inj.path, -2) + const tpath = slice(inj.path, -1) + const dpath = flatten([S_DTOP, srcpath.split(S_DT), '$:' + ckey]) + // Parent structure. + tcur = { [ckey]: tcur } + if (1 < size(tpath)) { + const pkey = getelem(inj.path, -3, S_DTOP) + tcur = { [pkey]: tcur } + dpath.push('$:' + pkey) + } + const tinj = inj.child(0, [ckey]) + tinj.path = tpath + tinj.nodes = slice(inj.nodes, -1) + tinj.parent = getelem(tinj.nodes, -1) + setprop(tinj.parent, ckey, tval) + tinj.val = tval + tinj.dpath = dpath + tinj.dparent = tcur + inject(tval, store, tinj) + rval = tinj.val + } + // _updateAncestors(inj, target, tkey, rval) + setprop(target, tkey, rval) + // Prevent callee from damaging first list entry (since we are in `val` mode). + return rval[0] } - - - // Convert a node to a map.
-// Format: { '`$PACK`':['`source-path`', child-template]} -const transform_PACK = ( - state, - _val, - current, - ref, - store -) => { - const { mode, key, path, parent, nodes } = state - - // Defensive context checks. - if (S_MKEYPRE !== mode || S_string !== typeof key || null == path || null == nodes) { - return UNDEF - } - - // Get arguments. - const args = parent[key] - const srcpath = args[0] // Path to source data. - const child = clone(args[1]) // Child template. - - // Find key and target node. - const keyprop = child[S_DKEY] - const tkey = path[path.length - 2] - const target = nodes[path.length - 2] || nodes[path.length - 1] - - // Source data - // const srcstore = getprop(store, getprop(state, S_base), store) - const srcstore = getprop(store, state.base, store) - let src = getpath(srcpath, srcstore, current) - // let src = getpath(srcpath, store, current, state) - - // Prepare source as a list. - src = islist(src) ? src : - ismap(src) ? Object.entries(src) - .reduce((a, n) => - (n[1][S_DMETA] = { KEY: n[0] }, a.push(n[1]), a), []) : - UNDEF - - if (null == src) { - return UNDEF - } - - // Get key if specified. - let childkey = getprop(child, S_DKEY) - let keyname = UNDEF === childkey ? keyprop : childkey - setprop(child, S_DKEY, UNDEF) - - // Build parallel target object. - let tval = {} - tval = src.reduce((a, n) => { - let kn = getprop(n, keyname) - setprop(a, kn, clone(child)) - const nchild = getprop(a, kn) - setprop(nchild, S_DMETA, getprop(n, S_DMETA)) - return a - }, tval) - - // Build parallel source object. - let tcurrent = {} - src.reduce((a, n) => { - let kn = getprop(n, keyname) - setprop(a, kn, n) - return a - }, tcurrent) - - tcurrent = { $TOP: tcurrent } - - // Build substructure. - tval = inject( - tval, - store, - state.modify, - tcurrent, - ) - - _updateAncestors(state, target, tkey, tval) - - // Drop transform key. 
- return UNDEF +// Format: { '`$PACK`':['source-path', child-template]} +const transform_PACK = (inj, _val, _ref, store) => { + const { mode, key, path, parent, nodes } = inj + const ijname = 'PACK' + if (!checkPlacement(M_KEYPRE, ijname, T_map, inj)) { + return NONE + } + // Get arguments. + const args = getprop(parent, key) + const [err, srcpath, origchildspec] = injectorArgs([T_string, T_any], args) + if (NONE !== err) { + inj.errs.push('$' + ijname + ': ' + err) + return NONE + } + // Find key and target node. + const tkey = getelem(path, -2) + const pathsize = size(path) + const target = getelem(nodes, pathsize - 2, () => getelem(nodes, pathsize - 1)) + // Source data + const srcstore = getprop(store, inj.base, store) + let src = getpath(srcstore, srcpath, inj) + // Prepare source as a list. + if (!islist(src)) { + if (ismap(src)) { + src = items(src, (item) => { + setprop(item[1], S_BANNO, { KEY: item[0] }) + return item[1] + }) + } + else { + src = NONE + } + } + if (null == src) { + return NONE + } + // Get keypath. + const keypath = getprop(origchildspec, S_BKEY) + const childspec = delprop(origchildspec, S_BKEY) + const child = getprop(childspec, S_BVAL, childspec) + // Build parallel target object. + let tval = {} + items(src, (item) => { + const srckey = item[0] + const srcnode = item[1] + let key = srckey + if (NONE !== keypath) { + if (keypath.startsWith('`')) { + key = inject(keypath, merge([{}, store, { $TOP: srcnode }], 1)) + } + else { + key = getpath(srcnode, keypath, inj) + } + } + const tchild = clone(child) + setprop(tval, key, tchild) + const anno = getprop(srcnode, S_BANNO) + if (NONE === anno) { + delprop(tchild, S_BANNO) + } + else { + setprop(tchild, S_BANNO, anno) + } + }) + let rval = {} + if (!isempty(tval)) { + // Build parallel source object. + let tsrc = {} + src.reduce((a, n, i) => { + let kn = null == keypath ? i : + keypath.startsWith('`') ?
+ inject(keypath, merge([{}, store, { $TOP: n }], 1)) : + getpath(n, keypath, inj) + setprop(a, kn, n) + return a + }, tsrc) + const tpath = slice(inj.path, -1) + const ckey = getelem(inj.path, -2) + const dpath = flatten([S_DTOP, srcpath.split(S_DT), '$:' + ckey]) + let tcur = { [ckey]: tsrc } + if (1 < size(tpath)) { + const pkey = getelem(inj.path, -3, S_DTOP) + tcur = { [pkey]: tcur } + dpath.push('$:' + pkey) + } + const tinj = inj.child(0, [ckey]) + tinj.path = tpath + tinj.nodes = slice(inj.nodes, -1) + tinj.parent = getelem(tinj.nodes, -1) + tinj.val = tval + tinj.dpath = dpath + tinj.dparent = tcur + inject(tval, store, tinj) + rval = tinj.val + } + // _updateAncestors(inj, target, tkey, rval) + setprop(target, tkey, rval) + // Drop transform key. + return NONE } - - -// Transform data using spec. -// Only operates on static JSON-like data. -// Arrays are treated as if they are objects with indices as keys. -function transform( - data, // Source data to transform into new data (original not mutated) - spec, // Transform specification; output follows this shape - extra, // Additional store of data and transforms. - modify // Optionally modify individual values. -) { - // Clone the spec so that the clone can be modified in place as the transform result. - spec = clone(spec) - - const extraTransforms = {} - const extraData = null == extra ? UNDEF : items(extra) - .reduce((a, n) => - (n[0].startsWith(S_DS) ? extraTransforms[n[0]] = n[1] : (a[n[0]] = n[1]), a), {}) - - const dataClone = merge([ - isempty(extraData) ? UNDEF : clone(extraData), - clone(data), - ]) - - // Define a top level store that provides transform operations. - const store = { - - // The inject function recognises this special location for the root of the source data. - // NOTE: to escape data that contains "`$FOO`" keys at the top level, - // place that data inside a holding map: { myholder: mydata }. - $TOP: dataClone, - - // Escape backtick (this also works inside backticks). 
- $BT: () => S_BT, - - // Escape dollar sign (this also works inside backticks). - $DS: () => S_DS, - - // Insert current date and time as an ISO string. - $WHEN: () => new Date().toISOString(), - - $DELETE: transform_DELETE, - $COPY: transform_COPY, - $KEY: transform_KEY, - $META: transform_META, - $MERGE: transform_MERGE, - $EACH: transform_EACH, - $PACK: transform_PACK, - - // Custom extra transforms, if any. - ...extraTransforms, - } - - const out = inject(spec, store, modify, store) - return out +// TODO: a ref that is not found should remove the key (setprop NONE) +// Reference original spec (enables recursive transformations) +// Format: ['`$REF`', '`spec-path`'] +const transform_REF = (inj, val, _ref, store) => { + const { nodes } = inj + if (M_VAL !== inj.mode) { + return NONE + } + // Get arguments: ['`$REF`', 'ref-path']. + const refpath = getprop(inj.parent, 1) + inj.keyI = size(inj.keys) + // Spec reference. + const spec = getprop(store, S_DSPEC)() + const dpath = slice(inj.path, 1) + const ref = getpath(spec, refpath, { + // TODO: test relative refs + // dpath: inj.path.slice(1), + dpath, + // dparent: getpath(spec, inj.path.slice(1)) + dparent: getpath(spec, dpath), + }) + let hasSubRef = false + if (isnode(ref)) { + walk(ref, (_k, v) => { + if ('`$REF`' === v) { + hasSubRef = true + } + return v + }) + } + let tref = clone(ref) + const cpath = slice(inj.path, -3) + const tpath = slice(inj.path, -1) + let tcur = getpath(store, cpath) + let tval = getpath(store, tpath) + let rval = NONE + if (!hasSubRef || NONE !== tval) { + const tinj = inj.child(0, [getelem(tpath, -1)]) + tinj.path = tpath + tinj.nodes = slice(inj.nodes, -1) + tinj.parent = getelem(nodes, -2) + tinj.val = tref + tinj.dpath = flatten([cpath]) + tinj.dparent = tcur + inject(tref, store, tinj) + rval = tinj.val + } + else { + rval = NONE + } + const grandparent = inj.setval(rval, 2) + if (islist(grandparent) && inj.prior) { + inj.prior.keyI-- + } + return val } - - -// A required string value.
NOTE: Rejects empty strings. -const validate_STRING = (state, _val, current) => { - let out = getprop(current, state.key) - - const t = typify(out) - if (S_string !== t) { - let msg = _invalidTypeMsg(state.path, S_string, t, out) - state.errs.push(msg) - return UNDEF - } - - if (S_MT === out) { - let msg = 'Empty string at ' + pathify(state.path, 1) - state.errs.push(msg) - return UNDEF - } - - return out +const transform_FORMAT = (inj, _val, _ref, store) => { + // console.log('FORMAT-START', inj, _val) + // Remove remaining keys to avoid spurious processing. + slice(inj.keys, 0, 1, true) + if (M_VAL !== inj.mode) { + return NONE + } + // Get arguments: ['`$FORMAT`', 'name', child]. + // TODO: EACH and PACK should accept custom functions too + const name = getprop(inj.parent, 1) + const child = getprop(inj.parent, 2) + // Source data. + const tkey = getelem(inj.path, -2) + const target = getelem(inj.nodes, -2, () => getelem(inj.nodes, -1)) + const cinj = injectChild(child, store, inj) + const resolved = cinj.val + let formatter = 0 < (T_function & typify(name)) ? name : getprop(FORMATTER, name) + if (NONE === formatter) { + inj.errs.push('$FORMAT: unknown format: ' + name + '.') + return NONE + } + let out = walk(resolved, formatter) + setprop(target, tkey, out) + // _updateAncestors(inj, target, tkey, out) + return out } - - -// A required number value (int or float). -const validate_NUMBER = (state, _val, current) => { - let out = getprop(current, state.key) - - const t = typify(out) - if (S_number !== t) { - state.errs.push(_invalidTypeMsg(state.path, S_number, t, out)) - return UNDEF - } - - return out +const FORMATTER = { + identity: (_k, v) => v, + upper: (_k, v) => isnode(v) ? v : ('' + v).toUpperCase(), + lower: (_k, v) => isnode(v) ? v : ('' + v).toLowerCase(), + string: (_k, v) => isnode(v) ?
v : ('' + v), + number: (_k, v) => { + if (isnode(v)) { + return v + } + else { + let n = Number(v) + if (isNaN(n)) { + n = 0 + } + return n + } + }, + integer: (_k, v) => { + if (isnode(v)) { + return v + } + else { + let n = Number(v) + if (isNaN(n)) { + n = 0 + } + return n | 0 + } + }, + concat: (k, v) => null == k && islist(v) ? join(items(v, (n => isnode(n[1]) ? S_MT : (S_MT + n[1]))), S_MT) : v } - - -// A required boolean value. -const validate_BOOLEAN = (state, _val, current) => { - let out = getprop(current, state.key) - - const t = typify(out) - if (S_boolean !== t) { - state.errs.push(_invalidTypeMsg(state.path, S_boolean, t, out)) - return UNDEF - } - - return out +const transform_APPLY = (inj, _val, _ref, store) => { + const ijname = 'APPLY' + if (!checkPlacement(M_VAL, ijname, T_list, inj)) { + return NONE + } + // const [err, apply, child] = injectorArgs([T_function, T_any], inj) + const [err, apply, child] = injectorArgs([T_function, T_any], slice(inj.parent, 1)) + if (NONE !== err) { + inj.errs.push('$' + ijname + ': ' + err) + return NONE + } + const tkey = getelem(inj.path, -2) + const target = getelem(inj.nodes, -2, () => getelem(inj.nodes, -1)) + const cinj = injectChild(child, store, inj) + const resolved = cinj.val + const out = apply(resolved, store, cinj) + setprop(target, tkey, out) + return out } - - -// A required object (map) value (contents not validated). -const validate_OBJECT = (state, _val, current) => { - let out = getprop(current, state.key) - - const t = typify(out) - if (t !== S_object) { - state.errs.push(_invalidTypeMsg(state.path, S_object, t, out)) - return UNDEF - } - - return out +// Transform data using spec. +// Only operates on static JSON-like data. +// Arrays are treated as if they are objects with indices as keys. 
+function transform(data, // Source data to transform into new data (original not mutated) +spec, // Transform specification; output follows this shape +injdef) { + // Clone the spec so that the clone can be modified in place as the transform result. + const origspec = spec + spec = clone(origspec) + const extra = injdef?.extra + const collect = null != injdef?.errs + const errs = injdef?.errs || [] + const extraTransforms = {} + const extraData = null == extra ? NONE : items(extra) + .reduce((a, n) => (n[0].startsWith(S_DS) ? extraTransforms[n[0]] = n[1] : (a[n[0]] = n[1]), a), {}) + const dataClone = merge([ + isempty(extraData) ? NONE : clone(extraData), + clone(data), + ]) + // Define a top level store that provides transform operations. + const store = merge([ + { + // The inject function recognises this special location for the root of the source data. + // NOTE: to escape data that contains "`$FOO`" keys at the top level, + // place that data inside a holding map: { myholder: mydata }. + $TOP: dataClone, + $SPEC: () => origspec, + // Escape backtick (this also works inside backticks). + $BT: () => S_BT, + // Escape dollar sign (this also works inside backticks). + $DS: () => S_DS, + // Insert current date and time as an ISO string. + $WHEN: () => new Date().toISOString(), + $DELETE: transform_DELETE, + $COPY: transform_COPY, + $KEY: transform_KEY, + $ANNO: transform_ANNO, + $MERGE: transform_MERGE, + $EACH: transform_EACH, + $PACK: transform_PACK, + $REF: transform_REF, + $FORMAT: transform_FORMAT, + $APPLY: transform_APPLY, + }, + // Custom extra transforms, if any. + extraTransforms, + { + $ERRS: errs, + } + ], 1) + const out = inject(spec, store, injdef) + const generr = (0 < size(errs) && !collect) + if (generr) { + throw new Error(join(errs, ' | ')) + } + return out } - - -// A required array (list) value (contents not validated). 
-const validate_ARRAY = (state, _val, current) => { - let out = getprop(current, state.key) - - const t = typify(out) - if (t !== S_array) { - state.errs.push(_invalidTypeMsg(state.path, S_array, t, out)) - return UNDEF - } - - return out +// A required string value. NOTE: Rejects empty strings. +const validate_STRING = (inj) => { + let out = getprop(inj.dparent, inj.key) + const t = typify(out) + if (0 === (T_string & t)) { + let msg = _invalidTypeMsg(inj.path, S_string, t, out, 'V1010') + inj.errs.push(msg) + return NONE + } + if (S_MT === out) { + let msg = 'Empty string at ' + pathify(inj.path, 1) + inj.errs.push(msg) + return NONE + } + return out } - - -// A required function value. -const validate_FUNCTION = (state, _val, current) => { - let out = getprop(current, state.key) - - const t = typify(out) - if (S_function !== t) { - state.errs.push(_invalidTypeMsg(state.path, S_function, t, out)) - return UNDEF - } - - return out +const validate_TYPE = (inj, _val, ref) => { + const tname = slice(ref, 1).toLowerCase() + const typev = 1 << (31 - TYPENAME.indexOf(tname)) + let out = getprop(inj.dparent, inj.key) + const t = typify(out) + // console.log('TYPE', tname, typev, tn(typev), 'O=', t, tn(t), out, 'C=', t & typev) + if (0 === (t & typev)) { + inj.errs.push(_invalidTypeMsg(inj.path, tname, t, out, 'V1001')) + return NONE + } + return out } - - // Allow any value. -const validate_ANY = (state, _val, current) => { - return getprop(current, state.key) +const validate_ANY = (inj) => { + let out = getprop(inj.dparent, inj.key) + return out } - - - // Specify child values for map or list. // Map syntax: {'`$CHILD`': child-template } // List syntax: ['`$CHILD`', child-template ] -const validate_CHILD = (state, _val, current) => { - const { mode, key, parent, keys, path } = state - - // Setup data structures for validation by cloning child template. - - // Map syntax. 
- if (S_MKEYPRE === mode) { - const childtm = getprop(parent, key) - - // Get corresponding current object. - const pkey = getprop(path, path.length - 2) - let tval = getprop(current, pkey) - - if (UNDEF == tval) { - tval = {} - } - else if (!ismap(tval)) { - state.errs.push(_invalidTypeMsg( - state.path.slice(0, state.path.length - 1), S_object, typify(tval), tval)) - return UNDEF +const validate_CHILD = (inj) => { + const { mode, key, parent, keys, path } = inj + // Setup data structures for validation by cloning child template. + // Map syntax. + if (M_KEYPRE === mode) { + const childtm = getprop(parent, key) + // Get corresponding current object. + const pkey = getelem(path, -2) + let tval = getprop(inj.dparent, pkey) + if (NONE == tval) { + tval = {} + } + else if (!ismap(tval)) { + inj.errs.push(_invalidTypeMsg(slice(inj.path, -1), S_object, typify(tval), tval, 'V0220')) + return NONE + } + const ckeys = keysof(tval) + for (let ckey of ckeys) { + setprop(parent, ckey, clone(childtm)) + // NOTE: modifying inj! This extends the child value loop in inject. + keys.push(ckey) + } + // Remove $CHILD to cleanup output. + inj.setval(NONE) + return NONE } - - const ckeys = keysof(tval) - for (let ckey of ckeys) { - setprop(parent, ckey, clone(childtm)) - - // NOTE: modifying state! This extends the child value loop in inject. - keys.push(ckey) + // List syntax. + if (M_VAL === mode) { + if (!islist(parent)) { + // $CHILD was not inside a list. + inj.errs.push('Invalid $CHILD as value') + return NONE + } + const childtm = getprop(parent, 1) + if (NONE === inj.dparent) { + // Empty list as default. + // parent.length = 0 + slice(parent, 0, 0, true) + return NONE + } + if (!islist(inj.dparent)) { + const msg = _invalidTypeMsg(slice(inj.path, -1), S_list, typify(inj.dparent), inj.dparent, 'V0230') + inj.errs.push(msg) + inj.keyI = size(parent) + return inj.dparent + } + // Clone children and reset inj key index.
+ // The inject child loop will now iterate over the cloned children, + // validating them against the current list values. + items(inj.dparent, (n) => setprop(parent, n[0], clone(childtm))) + slice(parent, 0, inj.dparent.length, true) + inj.keyI = 0 + const out = getprop(inj.dparent, 0) + return out } - - // Remove $CHILD to cleanup ouput. - _setparentprop(state, UNDEF) - return UNDEF - } - - // List syntax. - if (S_MVAL === mode) { - - if (!islist(parent)) { - // $CHILD was not inside a list. - state.errs.push('Invalid $CHILD as value') - return UNDEF + return NONE +} +// TODO: implement SOME, ALL +// FIX: ONE should mean exactly one, not at least one (=SOME) +// TODO: implement a generic validate_ALT to do all of these +// Match at least one of the specified shapes. +// Syntax: ['`$ONE`', alt0, alt1, ...] +const validate_ONE = (inj, _val, _ref, store) => { + const { mode, parent, keyI } = inj + // Only operate in val mode, since parent is a list. + if (M_VAL === mode) { + if (!islist(parent) || 0 !== keyI) { + inj.errs.push('The $ONE validator at field ' + + pathify(inj.path, 1, 1) + + ' must be the first element of an array.') + return + } + inj.keyI = size(inj.keys) + // Clean up structure, replacing [$ONE, ...] with current + inj.setval(inj.dparent, 2) + inj.path = slice(inj.path, -1) + inj.key = getelem(inj.path, -1) + let tvals = slice(parent, 1) + if (0 === size(tvals)) { + inj.errs.push('The $ONE validator at field ' + + pathify(inj.path, 1, 1) + + ' must have at least one argument.') + return + } + // See if we can find a match. + for (let tval of tvals) { + // If match, then errs.length = 0 + let terrs = [] + const vstore = merge([{}, store], 1) + vstore.$TOP = inj.dparent + const vcurrent = validate(inj.dparent, tval, { + extra: vstore, + errs: terrs, + meta: inj.meta, + }) + inj.setval(vcurrent, -2) + // Accept current value if there was a match + if (0 === size(terrs)) { + return + } + } + // There was no match.
+ const valdesc = replace(join(items(tvals, (n) => stringify(n[1])), ', '), R_TRANSFORM_NAME, (_m, p1) => p1.toLowerCase()) + inj.errs.push(_invalidTypeMsg(inj.path, (1 < size(tvals) ? 'one of ' : '') + valdesc, typify(inj.dparent), inj.dparent, 'V0210')) } - - const childtm = getprop(parent, 1) - - if (UNDEF === current) { - // Empty list as default. - parent.length = 0 - return UNDEF +} +const validate_EXACT = (inj) => { + const { mode, parent, key, keyI } = inj + // Only operate in val mode, since parent is a list. + if (M_VAL === mode) { + if (!islist(parent) || 0 !== keyI) { + inj.errs.push('The $EXACT validator at field ' + + pathify(inj.path, 1, 1) + + ' must be the first element of an array.') + return + } + inj.keyI = size(inj.keys) + // Clean up structure, replacing [$EXACT, ...] with current data parent + inj.setval(inj.dparent, 2) + // inj.path = slice(inj.path, 0, size(inj.path) - 1) + inj.path = slice(inj.path, 0, -1) + inj.key = getelem(inj.path, -1) + let tvals = slice(parent, 1) + if (0 === size(tvals)) { + inj.errs.push('The $EXACT validator at field ' + + pathify(inj.path, 1, 1) + + ' must have at least one argument.') + return + } + // See if we can find an exact value match. + let currentstr = undefined + for (let tval of tvals) { + let exactmatch = tval === inj.dparent + if (!exactmatch && isnode(tval)) { + currentstr = undefined === currentstr ? stringify(inj.dparent) : currentstr + const tvalstr = stringify(tval) + exactmatch = tvalstr === currentstr + } + if (exactmatch) { + return + } + } + // There was no match. + const valdesc = replace(join(items(tvals, (n) => stringify(n[1])), ', '), R_TRANSFORM_NAME, (_m, p1) => p1.toLowerCase()) + inj.errs.push(_invalidTypeMsg(inj.path, (1 < size(inj.path) ? '' : 'value ') + + 'exactly equal to ' + (1 === size(tvals) ? 
'' : 'one of ') + valdesc, typify(inj.dparent), inj.dparent, 'V0110')) } - - if (!islist(current)) { - const msg = _invalidTypeMsg( - state.path.slice(0, state.path.length - 1), S_array, typify(current), current) - state.errs.push(msg) - state.keyI = parent.length - return current + else { + delprop(parent, key) } - - // Clone children abd reset state key index. - // The inject child loop will now iterate over the cloned children, - // validating them againt the current list values. - current.map((_n, i) => parent[i] = clone(childtm)) - parent.length = current.length - state.keyI = 0 - const out = getprop(current, 0) - return out - } - - return UNDEF } - - -// Match at least one of the specified shapes. -// Syntax: ['`$ONE`', alt0, alt1, ...]okI -const validate_ONE = ( - state, - _val, - current, - _ref, - store, -) => { - const { mode, parent, path, keyI, nodes } = state - - // Only operate in val mode, since parent is a list. - if (S_MVAL === mode) { - if (!islist(parent) || 0 !== keyI) { - state.errs.push('The $ONE validator at field ' + - pathify(state.path, 1, 1) + - ' must be the first element of an array.') - return - } - - state.keyI = state.keys.length - - const grandparent = nodes[nodes.length - 2] - const grandkey = path[path.length - 2] - - // Clean up structure, replacing [$ONE, ...] with current - setprop(grandparent, grandkey, current) - state.path = state.path.slice(0, state.path.length - 1) - state.key = state.path[state.path.length - 1] - - let tvals = parent.slice(1) - if (0 === tvals.length) { - state.errs.push('The $ONE validator at field ' + - pathify(state.path, 1, 1) + - ' must have at least one argument.') - return +// This is the "modify" argument to inject. Use this to perform +// generic validation. Runs *after* any special commands. +const _validation = (pval, key, parent, inj) => { + if (NONE === inj) { + return } - - // See if we can find a match. 
- for (let tval of tvals) { - - // If match, then errs.length = 0 - let terrs = [] - - const vstore = { ...store } - vstore.$TOP = current - const vcurrent = validate(current, tval, vstore, terrs) - setprop(grandparent, grandkey, vcurrent) - - // Accept current value if there was a match - if (0 === terrs.length) { + if (SKIP === pval) { return - } } - - // There was no match. - - const valdesc = tvals - .map((v) => stringify(v)) - .join(', ') - .replace(/`\$([A-Z]+)`/g, (_m, p1) => p1.toLowerCase()) - - state.errs.push(_invalidTypeMsg( - state.path, - (1 < tvals.length ? 'one of ' : '') + valdesc, - typify(current), current, 'V0210')) - } -} - - -// Match exactly one of the specified shapes. -// Syntax: ['`$EXACT`', val0, val1, ...] -const validate_EXACT = ( - state, - _val, - current, - _ref, - _store -) => { - const { mode, parent, key, keyI, path, nodes } = state - - // Only operate in val mode, since parent is a list. - if (S_MVAL === mode) { - if (!islist(parent) || 0 !== keyI) { - state.errs.push('The $EXACT validator at field ' + - pathify(state.path, 1, 1) + - ' must be the first element of an array.') - return + // select needs exact matches + const exact = getprop(inj.meta, S_BEXACT, false) + // Current val to verify. + const cval = getprop(inj.dparent, key) + if (NONE === inj || (!exact && NONE === cval)) { + return } - - state.keyI = state.keys.length - - const grandparent = nodes[nodes.length - 2] - const grandkey = path[path.length - 2] - - // Clean up structure, replacing [$EXACT, ...] with current - setprop(grandparent, grandkey, current) - state.path = state.path.slice(0, state.path.length - 1) - state.key = state.path[state.path.length - 1] - - let tvals = parent.slice(1) - if (0 === tvals.length) { - state.errs.push('The $EXACT validator at field ' + - pathify(state.path, 1, 1) + - ' must have at least one argument.') - return + const ptype = typify(pval) + // Delete any special commands remaining. 
+ if (0 < (T_string & ptype) && pval.includes(S_DS)) { + return } - - // See if we can find an exact value match. - let currentstr = undefined - for (let tval of tvals) { - let exactmatch = tval === current - - if (!exactmatch && isnode(tval)) { - currentstr = undefined === currentstr ? stringify(current) : currentstr - const tvalstr = stringify(tval) - exactmatch = tvalstr === currentstr - } - - if (exactmatch) { + const ctype = typify(cval) + // Type mismatch. + if (ptype !== ctype && NONE !== pval) { + inj.errs.push(_invalidTypeMsg(inj.path, typename(ptype), ctype, cval, 'V0010')) return - } } - - const valdesc = tvals - .map((v) => stringify(v)) - .join(', ') - .replace(/`\$([A-Z]+)`/g, (_m, p1) => p1.toLowerCase()) - - state.errs.push(_invalidTypeMsg( - state.path, - (1 < state.path.length ? '' : 'value ') + - 'exactly equal to ' + (1 === tvals.length ? '' : 'one of ') + valdesc, - typify(current), current, 'V0110')) - } - else { - setprop(parent, key, UNDEF) - } -} - - -// This is the "modify" argument to inject. Use this to perform -// generic validation. Runs *after* any special commands. -const _validation = ( - pval, - key, - parent, - state, - current, - _store -) => { - - if (UNDEF === state) { - return - } - - // Current val to verify. - const cval = getprop(current, key) - - if (UNDEF === cval || UNDEF === state) { - return - } - - // const pval = getprop(parent, key) - const ptype = typify(pval) - - // Delete any special commands remaining. - if (S_string === ptype && pval.includes(S_DS)) { - return - } - - const ctype = typify(cval) - - // Type mismatch. 
- if (ptype !== ctype && UNDEF !== pval) { - state.errs.push(_invalidTypeMsg(state.path, ptype, ctype, cval, 'V0010')) - return - } - - if (ismap(cval)) { - if (!ismap(pval)) { - state.errs.push(_invalidTypeMsg(state.path, ptype, ctype, cval, 'V0020')) - return + if (ismap(cval)) { + if (!ismap(pval)) { + inj.errs.push(_invalidTypeMsg(inj.path, typename(ptype), ctype, cval, 'V0020')) + return + } + const ckeys = keysof(cval) + const pkeys = keysof(pval) + // Empty spec object {} means object can be open (any keys). + if (0 < size(pkeys) && true !== getprop(pval, '`$OPEN`')) { + const badkeys = [] + for (let ckey of ckeys) { + if (!haskey(pval, ckey)) { + badkeys.push(ckey) + } + } + // Closed object, so reject extra keys not in shape. + if (0 < size(badkeys)) { + const msg = 'Unexpected keys at field ' + pathify(inj.path, 1) + S_VIZ + join(badkeys, ', ') + inj.errs.push(msg) + } + } + else { + // Object is open, so merge in extra keys. + merge([pval, cval]) + if (isnode(pval)) { + delprop(pval, '`$OPEN`') + } + } } - - const ckeys = keysof(cval) - const pkeys = keysof(pval) - - // Empty spec object {} means object can be open (any keys). - if (0 < pkeys.length && true !== getprop(pval, '`$OPEN`')) { - const badkeys = [] - for (let ckey of ckeys) { - if (!haskey(pval, ckey)) { - badkeys.push(ckey) - } - } - - // Closed object, so reject extra keys not in shape. - if (0 < badkeys.length) { - const msg = - 'Unexpected keys at field ' + pathify(state.path, 1) + ': ' + badkeys.join(', ') - state.errs.push(msg) - } + else if (islist(cval)) { + if (!islist(pval)) { + inj.errs.push(_invalidTypeMsg(inj.path, typename(ptype), ctype, cval, 'V0030')) + } + } + else if (exact) { + if (cval !== pval) { + const pathmsg = 1 < size(inj.path) ? 'at field ' + pathify(inj.path, 1) + S_VIZ : S_MT + inj.errs.push('Value ' + pathmsg + cval + + ' should equal ' + pval + S_DT) + } } else { - // Object is open, so merge in extra keys. 
- merge([pval, cval]) - if (isnode(pval)) { - setprop(pval, '`$OPEN`', UNDEF) - } - } - } - else if (islist(cval)) { - if (!islist(pval)) { - state.errs.push(_invalidTypeMsg(state.path, ptype, ctype, cval, 'V0030')) - } - } - else { - // Spec value was a default, copy over data - setprop(parent, key, cval) - } - - return + // Spec value was a default, copy over data + setprop(parent, key, cval) + } + return } - - - // Validate a data structure against a shape specification. The shape // specification follows the "by example" principle. Plain data in // teh shape is treated as default values that also specify the @@ -1405,219 +1729,557 @@ const _validation = ( // provided to specify required values. Thus shape {a:'`$STRING`'} // validates {a:'A'} but not {a:1}. Empty map or list means the node // is open, and if missing an empty default is inserted. -function validate( - data, // Source data to transform into new data (original not mutated) - spec, // Transform specification; output follows this shape - - extra, // Additional custom checks - - // Optionally modify individual values. - collecterrs, -) { - const errs = null == collecterrs ? [] : collecterrs - - const store = { - // Remove the transform commands. - $DELETE: null, - $COPY: null, - $KEY: null, - $META: null, - $MERGE: null, - $EACH: null, - $PACK: null, - - $STRING: validate_STRING, - $NUMBER: validate_NUMBER, - $BOOLEAN: validate_BOOLEAN, - $OBJECT: validate_OBJECT, - $ARRAY: validate_ARRAY, - $FUNCTION: validate_FUNCTION, - $ANY: validate_ANY, - $CHILD: validate_CHILD, - $ONE: validate_ONE, - $EXACT: validate_EXACT, - - ...(extra || {}), - - // A special top level value to collect errors. 
- $ERRS: errs, - } - - const out = transform(data, spec, store, _validation) - - const generr = (0 < errs.length && null == collecterrs) - if (generr) { - throw new Error('Invalid data: ' + errs.join(' | ')) - } - - return out +function validate(data, // Source data to transform into new data (original not mutated) +spec, // Transform specification; output follows this shape +injdef) { + const extra = injdef?.extra + const collect = null != injdef?.errs + const errs = injdef?.errs || [] + const store = merge([ + { + // Remove the transform commands. + $DELETE: null, + $COPY: null, + $KEY: null, + $META: null, + $MERGE: null, + $EACH: null, + $PACK: null, + $STRING: validate_STRING, + $NUMBER: validate_TYPE, + $INTEGER: validate_TYPE, + $DECIMAL: validate_TYPE, + $BOOLEAN: validate_TYPE, + $NULL: validate_TYPE, + $NIL: validate_TYPE, + $MAP: validate_TYPE, + $LIST: validate_TYPE, + $FUNCTION: validate_TYPE, + $INSTANCE: validate_TYPE, + $ANY: validate_ANY, + $CHILD: validate_CHILD, + $ONE: validate_ONE, + $EXACT: validate_EXACT, + }, + getdef(extra, {}), + // A special top level value to collect errors. + // NOTE: collecterrs parameter always wins. + { + $ERRS: errs, + } + ], 1) + let meta = getprop(injdef, 'meta', {}) + setprop(meta, S_BEXACT, getprop(meta, S_BEXACT, false)) + const out = transform(data, spec, { + meta, + extra: store, + modify: _validation, + handler: _validatehandler, + errs, + }) + const generr = (0 < size(errs) && !collect) + if (generr) { + throw new Error(join(errs, ' | ')) + } + return out } - - -// Internal utilities -// ================== - - -// Set state.key property of state.parent node, ensuring reference consistency -// when needed by implementation language. 
-function _setparentprop(state, val) { - setprop(state.parent, state.key, val) +const select_AND = (inj, _val, _ref, store) => { + if (M_KEYPRE === inj.mode) { + const terms = getprop(inj.parent, inj.key) + const ppath = slice(inj.path, -1) + const point = getpath(store, ppath) + const vstore = merge([{}, store], 1) + vstore.$TOP = point + for (let term of terms) { + let terrs = [] + validate(point, term, { + extra: vstore, + errs: terrs, + meta: inj.meta, + }) + if (0 != size(terrs)) { + inj.errs.push('AND:' + pathify(ppath) + S_VIZ + stringify(point) + ' fail:' + stringify(terms)) + } + } + const gkey = getelem(inj.path, -2) + const gp = getelem(inj.nodes, -2) + setprop(gp, gkey, point) + } } - - -// Update all references to target in state.nodes. -function _updateAncestors(_state, target, tkey, tval) { - // SetProp is sufficient in JavaScript as target reference remains consistent even for lists. - setprop(target, tkey, tval) +const select_OR = (inj, _val, _ref, store) => { + if (M_KEYPRE === inj.mode) { + const terms = getprop(inj.parent, inj.key) + const ppath = slice(inj.path, -1) + const point = getpath(store, ppath) + const vstore = merge([{}, store], 1) + vstore.$TOP = point + for (let term of terms) { + let terrs = [] + validate(point, term, { + extra: vstore, + errs: terrs, + meta: inj.meta, + }) + if (0 === size(terrs)) { + const gkey = getelem(inj.path, -2) + const gp = getelem(inj.nodes, -2) + setprop(gp, gkey, point) + return + } + } + inj.errs.push('OR:' + pathify(ppath) + S_VIZ + stringify(point) + ' fail:' + stringify(terms)) + } } - - +const select_NOT = (inj, _val, _ref, store) => { + if (M_KEYPRE === inj.mode) { + const term = getprop(inj.parent, inj.key) + const ppath = slice(inj.path, -1) + const point = getpath(store, ppath) + const vstore = merge([{}, store], 1) + vstore.$TOP = point + let terrs = [] + validate(point, term, { + extra: vstore, + errs: terrs, + meta: inj.meta, + }) + if (0 == size(terrs)) { + inj.errs.push('NOT:' + 
pathify(ppath) + S_VIZ + stringify(point) + ' fail:' + stringify(term)) + } + const gkey = getelem(inj.path, -2) + const gp = getelem(inj.nodes, -2) + setprop(gp, gkey, point) + } +} +const select_CMP = (inj, _val, ref, store) => { + if (M_KEYPRE === inj.mode) { + const term = getprop(inj.parent, inj.key) + // const src = getprop(store, inj.base, store) + const gkey = getelem(inj.path, -2) + // const tval = getprop(src, gkey) + const ppath = slice(inj.path, -1) + const point = getpath(store, ppath) + let pass = false + if ('$GT' === ref && point > term) { + pass = true + } + else if ('$LT' === ref && point < term) { + pass = true + } + else if ('$GTE' === ref && point >= term) { + pass = true + } + else if ('$LTE' === ref && point <= term) { + pass = true + } + else if ('$LIKE' === ref && stringify(point).match(RegExp(term))) { + pass = true + } + if (pass) { + // Update spec to match found value so that _validate does not complain. + const gp = getelem(inj.nodes, -2) + setprop(gp, gkey, point) + } + else { + inj.errs.push('CMP: ' + pathify(ppath) + S_VIZ + stringify(point) + + ' fail:' + ref + ' ' + stringify(term)) + } + } + return NONE +} +// Select children from a top-level object that match a MongoDB-style query. +// Supports $and, $or, and equality comparisons. +// For arrays, children are elements; for objects, children are values. 
+// TODO: swap arg order for consistency +function select(children, query) { + if (!isnode(children)) { + return [] + } + if (ismap(children)) { + children = items(children, n => { + setprop(n[1], S_DKEY, n[0]) + return n[1] + }) + } + else { + children = items(children, (n) => (setprop(n[1], S_DKEY, +n[0]), n[1])) + } + const results = [] + const injdef = { + errs: [], + meta: { [S_BEXACT]: true }, + extra: { + $AND: select_AND, + $OR: select_OR, + $NOT: select_NOT, + $GT: select_CMP, + $LT: select_CMP, + $GTE: select_CMP, + $LTE: select_CMP, + $LIKE: select_CMP, + } + } + const q = clone(query) + walk(q, (_k, v) => { + if (ismap(v)) { + setprop(v, '`$OPEN`', getprop(v, '`$OPEN`', true)) + } + return v + }) + for (const child of children) { + injdef.errs = [] + validate(child, clone(q), injdef) + if (0 === size(injdef.errs)) { + results.push(child) + } + } + return results +} +// Injection state used for recursive injection into JSON - like data structures. +class Injection { + constructor(val, parent) { + this.val = val + this.parent = parent + this.errs = [] + this.dparent = NONE + this.dpath = [S_DTOP] + this.mode = M_VAL + this.full = false + this.keyI = 0 + this.keys = [S_DTOP] + this.key = S_DTOP + this.path = [S_DTOP] + this.nodes = [parent] + this.handler = _injecthandler + this.base = S_DTOP + this.meta = {} + } + toString(prefix) { + return 'INJ' + (null == prefix ? '' : S_FS + prefix) + S_CN + + pad(pathify(this.path, 1)) + + MODENAME[this.mode] + (this.full ? '/full' : '') + S_CN + + 'key=' + this.keyI + S_FS + this.key + S_FS + S_OS + this.keys + S_CS + + ' p=' + stringify(this.parent, -1, 1) + + ' m=' + stringify(this.meta, -1, 1) + + ' d/' + pathify(this.dpath, 1) + '=' + stringify(this.dparent, -1, 1) + + ' r=' + stringify(this.nodes[0]?.[S_DTOP], -1, 1) + } + descend() { + this.meta.__d++ + const parentkey = getelem(this.path, -2) + // Resolve current node in store for local paths. 
+ if (NONE === this.dparent) { + // Even if there's no data, dpath should continue to match path, so that + // relative paths work properly. + if (1 < size(this.dpath)) { + this.dpath = flatten([this.dpath, parentkey]) + } + } + else { + // this.dparent is the containing node of the current store value. + if (null != parentkey) { + this.dparent = getprop(this.dparent, parentkey) + let lastpart = getelem(this.dpath, -1) + if (lastpart === '$:' + parentkey) { + this.dpath = slice(this.dpath, -1) + } + else { + this.dpath = flatten([this.dpath, parentkey]) + } + } + } + // TODO: is this needed? + return this.dparent + } + child(keyI, keys) { + const key = strkey(keys[keyI]) + const val = this.val + const cinj = new Injection(getprop(val, key), val) + cinj.keyI = keyI + cinj.keys = keys + cinj.key = key + cinj.path = flatten([getdef(this.path, []), key]) + cinj.nodes = flatten([getdef(this.nodes, []), [val]]) + cinj.mode = this.mode + cinj.handler = this.handler + cinj.modify = this.modify + cinj.base = this.base + cinj.meta = this.meta + cinj.errs = this.errs + cinj.prior = this + cinj.dpath = flatten([this.dpath]) + cinj.dparent = this.dparent + return cinj + } + setval(val, ancestor) { + let parent = NONE + if (null == ancestor || ancestor < 2) { + parent = NONE === val ? + this.parent = delprop(this.parent, this.key) : + setprop(this.parent, this.key, val) + } + else { + const aval = getelem(this.nodes, 0 - ancestor) + const akey = getelem(this.path, 0 - ancestor) + parent = NONE === val ? + delprop(aval, akey) : + setprop(aval, akey, val) + } + // console.log('SETVAL', val, this.key, this.parent) + return parent + } +} +// Internal utilities +// ================== +// // Update all references to target in inj.nodes. +// function _updateAncestors(_inj: Injection, target: any, tkey: any, tval: any) { +// // SetProp is sufficient in TypeScript as target reference remains consistent even for lists. 
+// setprop(target, tkey, tval) +// } // Build a type validation error message. function _invalidTypeMsg(path, needtype, vt, v, _whence) { - let vs = null == v ? 'no value' : stringify(v) - - return 'Expected ' + - (1 < path.length ? ('field ' + pathify(path, 1) + ' to be ') : '') + - needtype + ', but found ' + - (null != v ? vt + ': ' : '') + vs + - - // Uncomment to help debug validation errors. - // (null == _whence ? '' : ' [' + _whence + ']') + - - '.' + let vs = null == v ? 'no value' : stringify(v) + return 'Expected ' + + (1 < size(path) ? ('field ' + pathify(path, 1) + ' to be ') : '') + + needtype + ', but found ' + + (null != v ? typename(vt) + S_VIZ : '') + vs + + // Uncomment to help debug validation errors. + // ' [' + _whence + ']' + + '.' } - - // Default inject handler for transforms. If the path resolves to a function, -// call the function passing the injection state. This is how transforms operate. -const _injecthandler = ( - state, - val, - current, - ref, - store -) => { - let out = val - const iscmd = isfunc(val) && (UNDEF === ref || ref.startsWith(S_DS)) - - // Only call val function if it is a special command ($NAME format). - if (iscmd) { - out = val(state, val, current, ref, store) - } - - // Update parent with value. Ensures references remain in node tree. - else if (S_MVAL === state.mode && state.full) { - _setparentprop(state, val) - } - - return out +// call the function passing the injection inj. This is how transforms operate. +const _injecthandler = (inj, val, ref, store) => { + let out = val + const iscmd = isfunc(val) && (NONE === ref || ref.startsWith(S_DS)) + // Only call val function if it is a special command ($NAME format). + // TODO: OR if meta.'$CALL' + if (iscmd) { + out = val(inj, val, ref, store) + } + // Update parent with value. Ensures references remain in node tree. 
+ else if (M_VAL === inj.mode && inj.full) { + inj.setval(val) + } + return out +} +const _validatehandler = (inj, val, ref, store) => { + let out = val + const m = ref.match(R_META_PATH) + const ismetapath = null != m + if (ismetapath) { + if ('=' === m[2]) { + inj.setval([S_BEXACT, val]) + } + else { + inj.setval(val) + } + inj.keyI = -1 + out = SKIP + } + else { + out = _injecthandler(inj, val, ref, store) + } + return out } - - // Inject values from a data store into a string. Not a public utility - used by // `inject`. Inject are marked with `path` where path is resolved // with getpath against the store or current (if defined) // arguments. See `getpath`. Custom injection handling can be -// provided by state.handler (this is used for transform functions). +// provided by inj.handler (this is used for transform functions). // The path can also have the special syntax $NAME999 where NAME is // upper case letters only, and 999 is any digits, which are // discarded. This syntax specifies the name of a transform, and // optionally allows transforms to be ordered by alphanumeric sorting. -function _injectstr( - val, - store, - current, - state -) { - // Can't inject into non-strings - if (S_string !== typeof val || S_MT === val) { - return S_MT - } - - let out = val - - // Pattern examples: "`a.b.c`", "`$NAME`", "`$NAME1`" - const m = val.match(/^`(\$[A-Z]+|[^`]+)[0-9]*`$/) - - // Full string of the val is an injection. - if (m) { - if (null != state) { - state.full = true +function _injectstr(val, store, inj) { + // Can't inject into non-strings + if (S_string !== typeof val || S_MT === val) { + return S_MT } - let pathref = m[1] - - // Special escapes inside injection. - pathref = - 3 < pathref.length ? pathref.replace(/\$BT/g, S_BT).replace(/\$DS/g, S_DS) : pathref - - // Get the extracted path reference. - out = getpath(pathref, store, current, state) - } - - else { - // Check for injections within the string. 
- const partial = (_m, ref) => { - - // Special escapes inside injection. - ref = 3 < ref.length ? ref.replace(/\$BT/g, S_BT).replace(/\$DS/g, S_DS) : ref - if (state) { - state.full = false - } - const found = getpath(ref, store, current, state) - - // Ensure inject value is a string. - return UNDEF === found ? S_MT : S_string === typeof found ? found : JSON.stringify(found) + let out = val + // Pattern examples: "`a.b.c`", "`$NAME`", "`$NAME1`" + const m = val.match(R_INJECTION_FULL) + // Full string of the val is an injection. + if (m) { + if (null != inj) { + inj.full = true + } + let pathref = m[1] + // Special escapes inside injection. + if (3 < size(pathref)) { + pathref = pathref.replace(R_BT_ESCAPE, S_BT).replace(R_DS_ESCAPE, S_DS) + } + // Get the extracted path reference. + out = getpath(store, pathref, inj) } - - out = val.replace(/`([^`]+)`/g, partial) - - // Also call the state handler on the entire string, providing the - // option for custom injection. - if (null != state && isfunc(state.handler)) { - state.full = true - out = state.handler(state, out, current, val, store) + else { + // Check for injections within the string. + const partial = (_m, ref) => { + // Special escapes inside injection. + if (3 < size(ref)) { + ref = ref.replace(R_BT_ESCAPE, S_BT).replace(R_DS_ESCAPE, S_DS) + } + if (inj) { + inj.full = false + } + const found = getpath(store, ref, inj) + // Ensure inject value is a string. + return NONE === found ? S_MT : S_string === typeof found ? found : JSON.stringify(found) + } + out = val.replace(R_INJECTION_PARTIAL, partial) + // Also call the inj handler on the entire string, providing the + // option for custom injection. 
+ if (null != inj && isfunc(inj.handler)) { + inj.full = true + out = inj.handler(inj, out, val, store) + } } - } - - return out + return out +} +// Handler Utilities +// ================= +const MODENAME = { + [M_VAL]: 'val', + [M_KEYPRE]: 'key:pre', + [M_KEYPOST]: 'key:post', +} +const PLACEMENT = { + [M_VAL]: 'value', + [M_KEYPRE]: S_key, + [M_KEYPOST]: S_key, +} +function checkPlacement(modes, ijname, parentTypes, inj) { + if (0 === (modes & inj.mode)) { + inj.errs.push('$' + ijname + ': invalid placement as ' + PLACEMENT[inj.mode] + + ', expected: ' + join(items([M_KEYPRE, M_KEYPOST, M_VAL].filter(m => modes & m), (n) => PLACEMENT[n[1]]), ',') + '.') + return false + } + if (!isempty(parentTypes)) { + const ptype = typify(inj.parent) + if (0 === (parentTypes & ptype)) { + inj.errs.push('$' + ijname + ': invalid placement in parent ' + typename(ptype) + + ', expected: ' + typename(parentTypes) + '.') + return false + } + } + return true +} +// function injectorArgs(argTypes: number[], inj: Injection): any { +function injectorArgs(argTypes, args) { + const numargs = size(argTypes) + const found = new Array(1 + numargs) + found[0] = NONE + for (let argI = 0; argI < numargs; argI++) { + // const arg = inj.parent[1 + argI] + const arg = args[argI] + const argType = typify(arg) + if (0 === (argTypes[argI] & argType)) { + found[0] = 'invalid argument: ' + stringify(arg, 22) + + ' (' + typename(argType) + ' at position ' + (1 + argI) + + ') is not of type: ' + typename(argTypes[argI]) + '.' + break + } + found[1 + argI] = arg + } + return found +} +function injectChild(child, store, inj) { + let cinj = inj + // Replace ['`$FORMAT`',...] 
with child + if (null != inj.prior) { + if (null != inj.prior.prior) { + cinj = inj.prior.prior.child(inj.prior.keyI, inj.prior.keys) + cinj.val = child + setprop(cinj.parent, inj.prior.key, child) + } + else { + cinj = inj.prior.child(inj.keyI, inj.keys) + cinj.val = child + setprop(cinj.parent, inj.key, child) + } + } + // console.log('FORMAT-INJECT-CHILD', child) + inject(child, store, cinj) + return cinj } - - class StructUtility { - clone = clone - escre = escre - escurl = escurl - getpath = getpath - getprop = getprop - haskey = haskey - inject = inject - isempty = isempty - isfunc = isfunc - iskey = iskey - islist = islist - ismap = ismap - isnode = isnode - items = items - joinurl = joinurl - keysof = keysof - merge = merge - pathify = pathify - setprop = setprop - strkey = strkey - stringify = stringify - transform = transform - typify = typify - validate = validate - walk = walk + constructor() { + this.clone = clone + this.delprop = delprop + this.escre = escre + this.escurl = escurl + this.filter = filter + this.flatten = flatten + this.getdef = getdef + this.getelem = getelem + this.getpath = getpath + this.getprop = getprop + this.haskey = haskey + this.inject = inject + this.isempty = isempty + this.isfunc = isfunc + this.iskey = iskey + this.islist = islist + this.ismap = ismap + this.isnode = isnode + this.items = items + this.join = join + this.jsonify = jsonify + this.keysof = keysof + this.merge = merge + this.pad = pad + this.pathify = pathify + this.select = select + this.setpath = setpath + this.setprop = setprop + this.size = size + this.slice = slice + this.strkey = strkey + this.stringify = stringify + this.transform = transform + this.typify = typify + this.typename = typename + this.validate = validate + this.walk = walk + this.SKIP = SKIP + this.DELETE = DELETE + this.jm = jm + this.jt = jt + this.tn = typename + this.T_any = T_any + this.T_noval = T_noval + this.T_boolean = T_boolean + this.T_decimal = T_decimal + this.T_integer = 
T_integer + this.T_number = T_number + this.T_string = T_string + this.T_function = T_function + this.T_symbol = T_symbol + this.T_null = T_null + this.T_list = T_list + this.T_map = T_map + this.T_instance = T_instance + this.T_scalar = T_scalar + this.T_node = T_node + this.checkPlacement = checkPlacement + this.injectorArgs = injectorArgs + this.injectChild = injectChild + } } module.exports = { StructUtility, + Injection, clone, + delprop, escre, escurl, + filter, + flatten, + getdef, + getelem, getpath, getprop, haskey, @@ -1629,16 +2291,54 @@ module.exports = { ismap, isnode, items, - joinurl, + join, + jsonify, keysof, merge, + pad, pathify, + select, + setpath, setprop, + size, + slice, strkey, stringify, transform, typify, + typename, validate, walk, + SKIP, + DELETE, + + jm, + jt, + + T_any, + T_noval, + T_boolean, + T_decimal, + T_integer, + T_number, + T_string, + T_function, + T_symbol, + T_null, + T_list, + T_map, + T_instance, + T_scalar, + T_node, + + M_KEYPRE, + M_KEYPOST, + M_VAL, + + MODENAME, + + checkPlacement, + injectorArgs, + injectChild, } diff --git a/js/test/runner.js b/js/test/runner.js index 3c3f1d2c..03906c05 100644 --- a/js/test/runner.js +++ b/js/test/runner.js @@ -46,7 +46,7 @@ async function makeRunner(testfile, client) { res = fixJSON(res, flags) entry.res = res - checkResult(entry, res, structUtils) + checkResult(entry, args, res, structUtils) } catch (err) { handleError(entry, err, structUtils) @@ -125,11 +125,16 @@ function resolveEntry(entry, flags) { } -function checkResult(entry, res, structUtils) { +function checkResult(entry, args, res, structUtils) { let matched = false + if (entry.err) { + return fail('Expected error did not occur: ' + entry.err + + '\n\nENTRY: ' + JSON.stringify(entry, null, 2)) + } + if (entry.match) { - const result = { in: entry.in, out: entry.res, ctx: entry.ctx } + const result = { in: entry.in, args, out: entry.res, ctx: entry.ctx } match( entry.match, result, @@ -254,7 +259,7 @@ function match( 
structUtils.walk(check, (_key, val, _parent, path) => { if(!structUtils.isnode(val)) { - let baseval = structUtils.getpath(path, base) + let baseval = structUtils.getpath(base, path) if (baseval === val) { return val diff --git a/js/test/struct.test.js b/js/test/struct.test.js index 2af7488c..04f12c4c 100644 --- a/js/test/struct.test.js +++ b/js/test/struct.test.js @@ -3,7 +3,7 @@ // RUN-SOME: npm run test-some --pattern=getpath const { test, describe } = require('node:test') -const { equal, deepEqual } = require('node:assert') +const assert = require('node:assert') const { makeRunner, @@ -15,86 +15,65 @@ const { SDK } = require('./sdk.js') const TEST_JSON_FILE = '../../build/test/test.json' +const { equal, deepEqual } = assert + // NOTE: tests are (mostly) in order of increasing dependence. describe('struct', async () => { const runner = await makeRunner(TEST_JSON_FILE, await SDK.test()) - + const { spec, runset, runsetflags, client } = await runner('struct') - const { - clone, - escre, - escurl, - getpath, - getprop, - - haskey, - inject, - isempty, - isfunc, - iskey, - - islist, - ismap, - isnode, - items, - joinurl, - - keysof, - merge, - pathify, - setprop, - strkey, - - stringify, - transform, - typify, - validate, - walk, - - } = client.utility().struct - - const minorSpec = spec.minor - const walkSpec = spec.walk - const mergeSpec = spec.merge - const getpathSpec = spec.getpath - const injectSpec = spec.inject - const transformSpec = spec.transform - const validateSpec = spec.validate + const struct = client.utility().struct test('exists', () => { - equal('function', typeof clone) - equal('function', typeof escre) - equal('function', typeof escurl) - equal('function', typeof getprop) - equal('function', typeof getpath) - - equal('function', typeof haskey) - equal('function', typeof inject) - equal('function', typeof isempty) - equal('function', typeof isfunc) - equal('function', typeof iskey) - - equal('function', typeof islist) - equal('function', typeof 
ismap) - equal('function', typeof isnode) - equal('function', typeof items) - equal('function', typeof joinurl) - - equal('function', typeof keysof) - equal('function', typeof merge) - equal('function', typeof pathify) - equal('function', typeof setprop) - equal('function', typeof strkey) - - equal('function', typeof stringify) - equal('function', typeof transform) - equal('function', typeof typify) - equal('function', typeof validate) - equal('function', typeof walk) + const s = struct + + equal('function', typeof s.clone) + equal('function', typeof s.delprop) + equal('function', typeof s.escre) + equal('function', typeof s.escurl) + equal('function', typeof s.filter) + + equal('function', typeof s.flatten) + equal('function', typeof s.getelem) + equal('function', typeof s.getprop) + + equal('function', typeof s.getpath) + equal('function', typeof s.haskey) + equal('function', typeof s.inject) + equal('function', typeof s.isempty) + equal('function', typeof s.isfunc) + + equal('function', typeof s.iskey) + equal('function', typeof s.islist) + equal('function', typeof s.ismap) + equal('function', typeof s.isnode) + equal('function', typeof s.items) + + equal('function', typeof s.join) + equal('function', typeof s.jsonify) + equal('function', typeof s.keysof) + equal('function', typeof s.merge) + equal('function', typeof s.pad) + equal('function', typeof s.pathify) + + equal('function', typeof s.select) + equal('function', typeof s.setpath) + equal('function', typeof s.size) + equal('function', typeof s.slice) + equal('function', typeof s.setprop) + + equal('function', typeof s.strkey) + equal('function', typeof s.stringify) + equal('function', typeof s.transform) + equal('function', typeof s.typify) + equal('function', typeof s.typename) + + equal('function', typeof s.validate) + equal('function', typeof s.walk) }) @@ -102,90 +81,148 @@ describe('struct', async () => { // =========== test('minor-isnode', async () => { - await runset(minorSpec.isnode, isnode) + 
await runset(spec.minor.isnode, struct.isnode) }) - test('minor-ismap', async () => { - await runset(minorSpec.ismap, ismap) + await runset(spec.minor.ismap, struct.ismap) }) - test('minor-islist', async () => { - await runset(minorSpec.islist, islist) + await runset(spec.minor.islist, struct.islist) }) - test('minor-iskey', async () => { - await runsetflags(minorSpec.iskey, { null: false }, iskey) + await runsetflags(spec.minor.iskey, { null: false }, struct.iskey) }) - test('minor-strkey', async () => { - await runsetflags(minorSpec.strkey, { null: false }, strkey) + await runsetflags(spec.minor.strkey, { null: false }, struct.strkey) }) - test('minor-isempty', async () => { - await runsetflags(minorSpec.isempty, { null: false }, isempty) + await runsetflags(spec.minor.isempty, { null: false }, struct.isempty) }) - test('minor-isfunc', async () => { - await runset(minorSpec.isfunc, isfunc) + const { isfunc } = struct + await runset(spec.minor.isfunc, isfunc) function f0() { return null } equal(isfunc(f0), true) equal(isfunc(() => null), true) }) - test('minor-clone', async () => { - await runsetflags(minorSpec.clone, { null: false }, clone) + await runsetflags(spec.minor.clone, { null: false }, struct.clone) + }) + + test('minor-edge-clone', async () => { + const { clone } = struct + const f0 = () => null deepEqual({ a: f0 }, clone({ a: f0 })) + + const x = { y: 1 } + let xc = clone(x) + deepEqual(x, xc) + assert(x !== xc) + + class A { constructor() { this.x = 1 } } + const a = new A() + let ac = clone(a) + deepEqual(a, ac) + assert(a === ac) + equal(a.constructor.name, ac.constructor.name) }) + test('minor-filter', async () => { + const checkmap = { + gt3: (n) => n[1] > 3, + lt3: (n) => n[1] < 3, + } + await runset(spec.minor.filter, (vin) => struct.filter(vin.val, checkmap[vin.check])) + }) - test('minor-escre', async () => { - await runset(minorSpec.escre, escre) + test('minor-flatten', async () => { + await runset(spec.minor.flatten, (vin) => 
struct.flatten(vin.val, vin.depth)) }) + test('minor-escre', async () => { + await runset(spec.minor.escre, struct.escre) + }) test('minor-escurl', async () => { - await runset(minorSpec.escurl, escurl) + await runset(spec.minor.escurl, struct.escurl) }) - test('minor-stringify', async () => { - await runset(minorSpec.stringify, (vin) => - stringify((NULLMARK === vin.val ? "null" : vin.val), vin.max)) + await runset(spec.minor.stringify, (vin) => + struct.stringify((NULLMARK === vin.val ? "null" : vin.val), vin.max)) }) + test('minor-edge-stringify', async () => { + const { stringify } = struct + const a = {} + a.a = a + equal(stringify(a), '__STRINGIFY_FAILED__') + + equal(stringify({ a: [9] }, -1, true), + '\x1B[38;5;81m\x1B[38;5;118m{\x1B[38;5;118ma\x1B[38;5;118m:' + + '\x1B[38;5;213m[\x1B[38;5;213m9\x1B[38;5;213m]\x1B[38;5;118m}\x1B[0m') + }) + + test('minor-jsonify', async () => { + await runsetflags(spec.minor.jsonify, { null: false }, + (vin) => struct.jsonify(vin.val, vin.flags)) + }) + + test('minor-edge-jsonify', async () => { + const { jsonify } = struct + equal(jsonify(() => 1), 'null') + }) test('minor-pathify', async () => { await runsetflags( - minorSpec.pathify, { null: true }, + spec.minor.pathify, { null: true }, (vin) => { let path = NULLMARK == vin.path ? undefined : vin.path - let pathstr = pathify(path, vin.from).replace('__NULL__.', '') + let pathstr = struct.pathify(path, vin.from).replace('__NULL__.', '') pathstr = NULLMARK === vin.path ? pathstr.replace('>', ':null>') : pathstr return pathstr }) }) - test('minor-items', async () => { - await runset(minorSpec.items, items) + await runset(spec.minor.items, struct.items) + }) + + test('minor-edge-items', async () => { + const { items } = struct + const a0 = [11, 22, 33] + a0.x = 1 + deepEqual(items(a0), [['0', 11], ['1', 22], ['2', 33]]) }) + test('minor-getelem', async () => { + const { getelem } = struct + await runsetflags(spec.minor.getelem, { null: false }, (vin) => + null == vin.alt ? 
getelem(vin.val, vin.key) : getelem(vin.val, vin.key, vin.alt)) + }) - test('minor-getprop', async () => { - await runsetflags(minorSpec.getprop, { null: false }, (vin) => - null == vin.alt ? getprop(vin.val, vin.key) : getprop(vin.val, vin.key, vin.alt)) + test('minor-edge-getelem', async () => { + const { getelem } = struct + equal(getelem([], 1, () => 2), 2) }) + test('minor-getprop', async () => { + const { getprop } = struct + await runsetflags(spec.minor.getprop, { null: false }, (vin) => + undefined === vin.alt ? getprop(vin.val, vin.key) : getprop(vin.val, vin.key, vin.alt)) + }) test('minor-edge-getprop', async () => { + const { getprop } = struct + let strarr = ['a', 'b', 'c', 'd', 'e'] deepEqual(getprop(strarr, 2), 'c') deepEqual(getprop(strarr, '2'), 'c') @@ -195,14 +232,14 @@ describe('struct', async () => { deepEqual(getprop(intarr, '2'), 5) }) - test('minor-setprop', async () => { - await runsetflags(minorSpec.setprop, { null: false }, (vin) => - setprop(vin.parent, vin.key, vin.val)) + await runset(spec.minor.setprop, (vin) => + struct.setprop(vin.parent, vin.key, vin.val)) }) - test('minor-edge-setprop', async () => { + const { setprop } = struct + let strarr0 = ['a', 'b', 'c', 'd', 'e'] let strarr1 = ['a', 'b', 'c', 'd', 'e'] deepEqual(setprop(strarr0, 2, 'C'), ['a', 'b', 'C', 'd', 'e']) @@ -214,25 +251,94 @@ describe('struct', async () => { deepEqual(setprop(intarr1, '2', 555), [2, 3, 555, 7, 11]) }) + test('minor-delprop', async () => { + await runset(spec.minor.delprop, (vin) => + struct.delprop(vin.parent, vin.key)) + }) - test('minor-haskey', async () => { - await runsetflags(minorSpec.haskey, { null: false }, (vin) => - haskey(vin.src, vin.key)) + test('minor-edge-delprop', async () => { + const { delprop } = struct + + let strarr0 = ['a', 'b', 'c', 'd', 'e'] + let strarr1 = ['a', 'b', 'c', 'd', 'e'] + deepEqual(delprop(strarr0, 2), ['a', 'b', 'd', 'e']) + deepEqual(delprop(strarr1, '2'), ['a', 'b', 'd', 'e']) + + let intarr0 = [2, 3, 5, 7, 
11] + let intarr1 = [2, 3, 5, 7, 11] + deepEqual(delprop(intarr0, 2), [2, 3, 7, 11]) + deepEqual(delprop(intarr1, '2'), [2, 3, 7, 11]) }) + test('minor-haskey', async () => { + await runsetflags(spec.minor.haskey, { null: false }, (vin) => + struct.haskey(vin.src, vin.key)) + }) test('minor-keysof', async () => { - await runset(minorSpec.keysof, keysof) + await runset(spec.minor.keysof, struct.keysof) }) + test('minor-edge-keysof', async () => { + const { keysof } = struct + const a0 = [11, 22, 33] + a0.x = 1 + deepEqual(keysof(a0), [0, 1, 2]) + }) - test('minor-joinurl', async () => { - await runsetflags(minorSpec.joinurl, { null: false }, joinurl) + test('minor-join', async () => { + await runsetflags(spec.minor.join, { null: false }, + (vin) => struct.join(vin.val, vin.sep, vin.url)) }) + test('minor-typename', async () => { + await runset(spec.minor.typename, struct.typename) + }) test('minor-typify', async () => { - await runsetflags(minorSpec.typify, { null: false }, typify) + await runsetflags(spec.minor.typify, { null: false }, struct.typify) + }) + + test('minor-edge-typify', async () => { + const { + typify, T_noval, T_scalar, T_function, T_symbol, T_any, T_node, T_instance, T_null + } = struct + class X { } + const x = new X() + equal(typify(), T_noval) + equal(typify(undefined), T_noval) + equal(typify(NaN), T_noval) + equal(typify(null), T_scalar | T_null) + equal(typify(() => null), T_scalar | T_function) + equal(typify(Symbol('S')), T_scalar | T_symbol) + equal(typify(BigInt(1)), T_any) + equal(typify(x), T_node | T_instance) + }) + + test('minor-size', async () => { + await runsetflags(spec.minor.size, { null: false }, struct.size) + }) + + test('minor-slice', async () => { + await runsetflags(spec.minor.slice, { null: false }, + (vin) => struct.slice(vin.val, vin.start, vin.end)) + }) + + test('minor-pad', async () => { + await runsetflags(spec.minor.pad, { null: false }, + (vin) => struct.pad(vin.val, vin.pad, vin.char)) + }) + + 
test('minor-setpath', async () => { + await runsetflags(spec.minor.setpath, { null: false }, + (vin) => struct.setpath(vin.store, vin.path, vin.val)) + }) + + test('minor-edge-setpath', async () => { + const { setpath, DELETE } = struct + const x = { y: { z: 1, q: 2 } } + deepEqual(setpath(x, 'y.q', DELETE), { z: 1 }) + deepEqual(x, { y: { z: 1 } }) }) @@ -240,9 +346,11 @@ describe('struct', async () => { // ========== test('walk-log', async () => { - const test = clone(walkSpec.log) + const { clone, stringify, pathify, walk } = struct - const log = [] + const test = clone(spec.walk.log) + + let log = [] function walklog(key, val, parent, path) { log.push('k=' + stringify(key) + @@ -252,8 +360,16 @@ describe('struct', async () => { return val } + walk(test.in, undefined, walklog) + deepEqual(log, test.out.after) + + log = [] walk(test.in, walklog) - deepEqual(log, test.out) + deepEqual(log, test.out.before) + + log = [] + walk(test.in, walklog, walklog) + deepEqual(log, test.out.both) }) @@ -262,7 +378,60 @@ describe('struct', async () => { return 'string' === typeof val ? val + '~' + path.join('.') : val } - await runset(walkSpec.basic, (vin) => walk(vin, walkpath)) + await runset(spec.walk.basic, (vin) => struct.walk(vin, walkpath)) + }) + + + test('walk-depth', async () => { + await runsetflags(spec.walk.depth, { null: false }, + (vin) => { + let top = undefined + let cur = undefined + function copy(key, val, _parent, _path) { + if (undefined === key || struct.isnode(val)) { + let child = struct.islist(val) ? [] : {} + if (undefined === key) { + top = cur = child + } + else { + cur = cur[key] = child + } + } + else { + cur[key] = val + } + return val + } + struct.walk(vin.src, copy, undefined, vin.maxdepth) + return top + }) + }) + + + test('walk-copy', async () => { + const { walk, isnode, ismap, islist, size, setprop } = struct + + let cur + function walkcopy(key, val, _parent, path) { + if (undefined === key) { + cur = [] + cur[0] = ismap(val) ? 
{} : islist(val) ? [] : val + return val + } + + let v = val + let i = size(path) + + if (isnode(v)) { + v = cur[i] = ismap(v) ? {} : [] + } + + setprop(cur[i - 1], key, v) + + return val + } + + await runset(spec.walk.copy, (vin) => (walk(vin, walkcopy), cur[0])) }) @@ -270,38 +439,67 @@ describe('struct', async () => { // =========== test('merge-basic', async () => { - const test = clone(mergeSpec.basic) + const { clone, merge } = struct + const test = clone(spec.merge.basic) deepEqual(merge(test.in), test.out) }) - test('merge-cases', async () => { - await runset(mergeSpec.cases, merge) + await runset(spec.merge.cases, struct.merge) }) - test('merge-array', async () => { - await runset(mergeSpec.array, merge) + await runset(spec.merge.array, struct.merge) }) - test('merge-integrity', async () => { - await runset(mergeSpec.integrity, merge) + await runset(spec.merge.integrity, struct.merge) }) + test('merge-depth', async () => { + await runset(spec.merge.depth, (vin) => struct.merge(vin.val, vin.depth)) + }) test('merge-special', async () => { + const { merge } = struct const f0 = () => null deepEqual(merge([f0]), f0) deepEqual(merge([null, f0]), f0) - deepEqual(merge([[f0]]), [f0]) deepEqual(merge([{ a: f0 }]), { a: f0 }) + deepEqual(merge([[f0]]), [f0]) deepEqual(merge([{ a: { b: f0 } }]), { a: { b: f0 } }) // JavaScript only deepEqual(merge([{ a: global.fetch }]), { a: global.fetch }) deepEqual(merge([[global.fetch]]), [global.fetch]) deepEqual(merge([{ a: { b: global.fetch } }]), { a: { b: global.fetch } }) + + class Bar { constructor() { this.x = 1 } } + const b0 = new Bar() + + equal(merge([{ x: 10 }, b0]), b0) + equal(b0.x, 1) + equal(b0 instanceof Bar, true) + + deepEqual(merge([{ a: b0 }, { a: { x: 11 } }]), { a: { x: 11 } }) + equal(b0.x, 1) + equal(b0 instanceof Bar, true) + + deepEqual(merge([b0, { x: 20 }]), { x: 20 }) + equal(b0.x, 1) + equal(b0 instanceof Bar, true) + + let out = merge([{ a: { x: 21 } }, { a: b0 }]) + deepEqual(out, { a: b0 }) + 
equal(b0, out.a) + equal(b0.x, 1) + equal(b0 instanceof Bar, true) + + out = merge([{}, { b: b0 }]) + deepEqual(out, { b: b0 }) + equal(b0, out.b) + equal(b0.x, 1) + equal(b0 instanceof Bar, true) }) @@ -309,38 +507,34 @@ describe('struct', async () => { // ============= test('getpath-basic', async () => { - await runset(getpathSpec.basic, (vin) => getpath(vin.path, vin.store)) + await runset(spec.getpath.basic, (vin) => struct.getpath(vin.store, vin.path)) }) - - test('getpath-current', async () => { - await runset(getpathSpec.current, (vin) => - getpath(vin.path, vin.store, vin.current)) + test('getpath-relative', async () => { + await runset(spec.getpath.relative, (vin) => + struct.getpath(vin.store, vin.path, + { dparent: vin.dparent, dpath: vin.dpath?.split('.') })) }) + test('getpath-special', async () => { + await runset(spec.getpath.special, (vin) => + struct.getpath(vin.store, vin.path, vin.inj)) + }) - test('getpath-state', async () => { - const state = { - handler: (state, val, _current, _ref, _store) => { - let out = state.meta.step + ':' + val - state.meta.step++ - return out - }, - meta: { step: 0 }, - mode: 'val', - full: false, - keyI: 0, - keys: ['$TOP'], - key: '$TOP', - val: '', - parent: {}, - path: ['$TOP'], - nodes: [{}], - base: '$TOP', - errs: [], - } - await runset(getpathSpec.state, (vin) => - getpath(vin.path, vin.store, vin.current, state)) + test('getpath-handler', async () => { + await runset(spec.getpath.handler, (vin) => + struct.getpath( + { + $TOP: vin.store, + $FOO: () => 'foo', + }, + vin.path, + { + handler: (_inj, val, _ref, _store) => { + return val() + } + } + )) }) @@ -348,19 +542,18 @@ describe('struct', async () => { // ============ test('inject-basic', async () => { - const test = clone(injectSpec.basic) + const { clone, inject } = struct + const test = clone(spec.inject.basic) deepEqual(inject(test.in.val, test.in.store), test.out) }) - test('inject-string', async () => { - await runset(injectSpec.string, (vin) => - 
inject(vin.val, vin.store, nullModifier, vin.current)) + await runset(spec.inject.string, (vin) => + struct.inject(vin.val, vin.store, { modify: nullModifier })) }) - test('inject-deep', async () => { - await runset(injectSpec.deep, (vin) => inject(vin.val, vin.store)) + await runset(spec.inject.deep, (vin) => struct.inject(vin.val, vin.store)) }) @@ -368,55 +561,76 @@ describe('struct', async () => { // =============== test('transform-basic', async () => { - const test = clone(transformSpec.basic) - deepEqual(transform(test.in.data, test.in.spec, test.in.store), test.out) + const { clone, transform } = struct + const test = clone(spec.transform.basic) + deepEqual(transform(test.in.data, test.in.spec), test.out) }) - test('transform-paths', async () => { - await runset(transformSpec.paths, (vin) => - transform(vin.data, vin.spec, vin.store)) + await runset(spec.transform.paths, (vin) => + struct.transform(vin.data, vin.spec)) }) - test('transform-cmds', async () => { - await runset(transformSpec.cmds, (vin) => - transform(vin.data, vin.spec, vin.store)) + await runset(spec.transform.cmds, (vin) => + struct.transform(vin.data, vin.spec)) }) - test('transform-each', async () => { - await runset(transformSpec.each, (vin) => - transform(vin.data, vin.spec, vin.store)) + await runset(spec.transform.each, (vin) => + struct.transform(vin.data, vin.spec)) }) - test('transform-pack', async () => { - await runset(transformSpec.pack, (vin) => - transform(vin.data, vin.spec, vin.store)) + await runset(spec.transform.pack, (vin) => + struct.transform(vin.data, vin.spec)) }) + test('transform-ref', async () => { + await runset(spec.transform.ref, (vin) => + struct.transform(vin.data, vin.spec)) + }) + + test('transform-format', async () => { + await runsetflags(spec.transform.format, { null: false }, (vin) => + struct.transform(vin.data, vin.spec)) + }) + + test('transform-apply', async () => { + await runset(spec.transform.apply, (vin) => + struct.transform(vin.data, vin.spec)) 
+ }) + + test('transform-edge-apply', async () => { + const { transform } = struct + equal(2, transform({}, ['`$APPLY`', (v) => 1 + v, 1])) + }) test('transform-modify', async () => { - await runset(transformSpec.modify, (vin) => - transform(vin.data, vin.spec, vin.store, - (val, key, parent) => { - if (null != key && null != parent && 'string' === typeof val) { - val = parent[key] = '@' + val + await runset(spec.transform.modify, (vin) => + struct.transform( + vin.data, + vin.spec, + { + modify: (val, key, parent) => { + if (null != key && null != parent && 'string' === typeof val) { + val = parent[key] = '@' + val + } } } )) }) - test('transform-extra', async () => { - deepEqual(transform( + deepEqual(struct.transform( { a: 1 }, { x: '`a`', b: '`$COPY`', c: '`$UPPER`' }, { - b: 2, $UPPER: (state) => { - const { path } = state - return ('' + getprop(path, path.length - 1)).toUpperCase() + extra: { + b: 2, $UPPER: (inj) => { + const { path } = inj + return ('' + struct.getprop(path, path.length - 1)).toUpperCase() + } } } ), { @@ -426,8 +640,8 @@ describe('struct', async () => { }) }) - test('transform-funcval', async () => { + const { transform } = struct const f0 = () => 99 deepEqual(transform({}, { x: 1 }), { x: 1 }) deepEqual(transform({}, { x: f0 }), { x: f0 }) @@ -440,41 +654,63 @@ describe('struct', async () => { // =============== test('validate-basic', async () => { - await runset(validateSpec.basic, (vin) => validate(vin.data, vin.spec)) + await runsetflags(spec.validate.basic, { null: false }, + (vin) => struct.validate(vin.data, vin.spec)) }) - test('validate-child', async () => { - await runset(validateSpec.child, (vin) => validate(vin.data, vin.spec)) + await runset(spec.validate.child, (vin) => struct.validate(vin.data, vin.spec)) }) - test('validate-one', async () => { - await runset(validateSpec.one, (vin) => validate(vin.data, vin.spec)) + await runset(spec.validate.one, (vin) => struct.validate(vin.data, vin.spec)) }) - test('validate-exact', 
async () => { - await runset(validateSpec.exact, (vin) => validate(vin.data, vin.spec)) + await runset(spec.validate.exact, (vin) => struct.validate(vin.data, vin.spec)) }) - test('validate-invalid', async () => { - await runsetflags(validateSpec.invalid, { null: false }, - (vin) => validate(vin.data, vin.spec)) + await runsetflags(spec.validate.invalid, { null: false }, + (vin) => struct.validate(vin.data, vin.spec)) }) + test('validate-special', async () => { + await runset(spec.validate.special, (vin) => + struct.validate(vin.data, vin.spec, vin.inj)) + }) + + test('validate-edge', async () => { + const { validate } = struct + let errs = [] + validate({ x: 1 }, { x: '`$INSTANCE`' }, { errs }) + equal(errs[0], 'Expected field x to be instance, but found integer: 1.') + + errs = [] + validate({ x: {} }, { x: '`$INSTANCE`' }, { errs }) + equal(errs[0], 'Expected field x to be instance, but found map: {}.') + + errs = [] + validate({ x: [] }, { x: '`$INSTANCE`' }, { errs }) + equal(errs[0], 'Expected field x to be instance, but found list: [].') + + class C { } + const c = new C() + errs = [] + validate({ x: c }, { x: '`$INSTANCE`' }, { errs }) + equal(errs.length, 0) + }) test('validate-custom', async () => { const errs = [] const extra = { - $INTEGER: (state, _val, current) => { - const { key } = state - let out = getprop(current, key) + $INTEGER: (inj) => { + const { key } = inj + let out = struct.getprop(inj.dparent, key) let t = typeof out if ('number' !== t && !Number.isInteger(out)) { - state.errs.push('Not an integer at ' + state.path.slice(1).join('.') + ': ' + out) + inj.errs.push('Not an integer at ' + inj.path.slice(1).join('.') + ': ' + out) return } @@ -484,13 +720,62 @@ describe('struct', async () => { const shape = { a: '`$INTEGER`' } - let out = validate({ a: 1 }, shape, extra, errs) + let out = struct.validate({ a: 1 }, shape, { extra, errs }) deepEqual(out, { a: 1 }) equal(errs.length, 0) - out = validate({ a: 'A' }, shape, extra, errs) + out = 
struct.validate({ a: 'A' }, shape, { extra, errs }) deepEqual(out, { a: 'A' }) deepEqual(errs, ['Not an integer at a: A']) }) + + // select tests + // ============ + + test('select-basic', async () => { + await runset(spec.select.basic, (vin) => struct.select(vin.obj, vin.query)) + }) + + test('select-operators', async () => { + await runset(spec.select.operators, (vin) => struct.select(vin.obj, vin.query)) + }) + + test('select-edge', async () => { + await runset(spec.select.edge, (vin) => struct.select(vin.obj, vin.query)) + }) + + test('select-alts', async () => { + await runset(spec.select.alts, (vin) => struct.select(vin.obj, vin.query)) + }) + + + // JSON Builder + // ============ + + test('json-builder', async () => { + const { jsonify, jm, jt } = struct + equal(jsonify(jm( + 'a', 1 + )), '{\n "a": 1\n}') + + equal(jsonify(jt( + 'b', 2 + )), '[\n "b",\n 2\n]') + + equal(jsonify(jm( + 'c', 'C', + 'd', jm('x', true), + 'e', jt(null, false) + )), '{\n "c": "C",\n "d": {\n "x": true\n },\n "e": [\n null,\n false\n ]\n}') + + equal(jsonify(jm( + true, 1, + false, 2, + null, 3, + ['a'], 4, + { 'b': 0 }, 5 + )), '{\n "true": 1,\n "false": 2,\n "null": 3,\n "[a]": 4,\n "{b:0}": 5\n}') + }) + }) diff --git a/lua/NOTES.md b/lua/NOTES.md new file mode 100644 index 00000000..41235366 --- /dev/null +++ b/lua/NOTES.md @@ -0,0 +1,29 @@ +# Lua Implementation Notes + +## undefined vs null + +Lua has only `nil` — there is no native distinction between "absent" and "null". +Additionally, setting a table key to `nil` removes the key entirely. +For this library: +- `nil` is used to represent **property absence** (the TypeScript `undefined` equivalent). +- TypeScript tests relating to `undefined` should be treated as **property absence**: the key + does not exist in the table, or the function parameter was not provided. +- JSON null is ambiguous with `nil`. 
Where the distinction matters, the test runner uses + marker strings: `NULLMARK = '__NULL__'` for JSON null and `UNDEFMARK = '__UNDEF__'` for absent values. +- Since `nil` cannot be stored as a table value (it deletes the key), a sentinel value + (e.g., `json.null` from the JSON library) may be needed where JSON null must be preserved. +- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues. + +## Type System + +This implementation uses bitfield integers for the type system, matching the TypeScript canonical. +Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are exported and `typify()` returns +integer bitfields. Use `typename()` to get the human-readable name for error messages. +Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`). + +## 1-Based Indexing + +Lua tables use 1-based indexing internally. The library translates to 0-based indexing at the +API boundary to match the cross-language test suite and TypeScript canonical behavior. All +external-facing index values (path arrays, `keysof` output, `getprop`/`setprop` keys) use +0-based integers. diff --git a/lua/REVIEW.md b/lua/REVIEW.md new file mode 100644 index 00000000..33c384bf --- /dev/null +++ b/lua/REVIEW.md @@ -0,0 +1,167 @@ +# Lua (lua) - Review vs TypeScript Canonical + +## Overview + +The Lua version is a comprehensive implementation with **39 exported functions**, closely tracking the TypeScript canonical. It uses the unified `injdef` table pattern, has full type constants, and supports all major operations (inject, transform, validate, select). The primary challenges come from Lua's fundamental language differences: 1-based indexing, tables for both maps and lists, and no native distinction between arrays and objects. 
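As a rough illustration of the index translation named in the overview above, the following JavaScript sketch models a Lua-style list whose slots are keyed from 1 internally while callers use 0-based keys. The names `makeOneBasedList`, `getprop0`, and `keysof0` are hypothetical and are not part of the library; the actual Lua code may structure this differently.

```javascript
// Hypothetical model: a list stored with 1-based slots, as a Lua table would be.
function makeOneBasedList(values) {
  const slots = {}
  values.forEach((v, i) => { slots[i + 1] = v }) // external index i -> internal slot i + 1
  return { slots, n: values.length }
}

// 0-based read at the API boundary. Keys may be integers or digit strings;
// anything else, or anything out of range, reads as absent.
function getprop0(list, key) {
  const i = typeof key === 'string' && /^\d+$/.test(key) ? Number(key) : key
  if (!Number.isInteger(i) || i < 0 || i >= list.n) {
    return undefined
  }
  return list.slots[i + 1]
}

// Keys are reported 0-based, as strings, regardless of internal storage.
function keysof0(list) {
  return Array.from({ length: list.n }, (_, i) => String(i))
}

const list = makeOneBasedList(['a', 'b', 'c'])
console.log(getprop0(list, 0), getprop0(list, '2'), getprop0(list, 3))
console.log(keysof0(list))
```

Keeping the `+ 1` adjustment in one accessor, rather than scattering it across callers, is one way to limit the off-by-one risk the review describes.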
+ +--- + +## Missing Functions + +| Function | Category | Impact | +|----------|----------|--------| +| `replace` | String | No unified string replace wrapper | +| `jm`/`jt` | JSON builders | No JSON builder functions (less needed in Lua since tables are flexible) | + +--- + +## Naming Differences + +All function names are lowercase in Lua (matching TS convention). No significant naming differences. + +--- + +## API Signature Differences + +### 1. `injdef` is a plain table + +- **TS**: `injdef` is `Partial` with typed fields. +- **Lua**: `injdef` is a plain table with the same field names. +- **Notes**: Functionally equivalent; Lua doesn't have typed interfaces. + +### 2. `items` returns `{key, val}` tables instead of `[key, val]` arrays + +- **TS**: Returns `[string, any][]` - array of 2-element tuples. +- **Lua**: Returns array of `{key, val}` tables (named fields). +- **Impact**: Different access pattern: `item[0]`/`item[1]` in TS vs `item.key`/`item.val` in Lua. May complicate cross-language test alignment. + +### 3. `clone` accepts `flags` table + +- **TS**: `clone(val)` - no flags. +- **Lua**: `clone(val, flags)` where `flags.func` controls function cloning. +- **Notes**: Extra feature, not a divergence. + +--- + +## Significant Language Difference Issues + +### 1. 1-Based Indexing (Critical) + +- **Issue**: Lua arrays are 1-based, while JavaScript/TypeScript arrays are 0-based. This affects every function that deals with list indices. +- **Areas affected**: + - `getprop`/`setprop`/`getelem` must translate between 0-based external API and 1-based internal Lua tables. + - `slice` start/end parameters use 0-based convention externally but 1-based internally. + - Path arrays use 0-based string indices to match the cross-language test.json format. + - `keysof` for lists returns 0-based string indices (`"0"`, `"1"`, `"2"`) to match TS, despite Lua tables being 1-based internally. +- **Impact**: This is the single largest source of potential bugs. 
Every index translation is an off-by-one risk. The implementation handles this via explicit `+ 1` / `- 1` adjustments. +- **Recommendation**: Thorough edge case testing for all index boundary conditions (empty lists, single-element lists, negative indices, out-of-bounds indices). + +### 2. Tables Are Both Maps and Lists (Critical) + +- **Issue**: Lua has a single `table` type for both arrays (sequential integer keys) and maps (string keys). There is no native way to distinguish an empty array `[]` from an empty object `{}`. +- **Workaround**: Uses metatables with `__jsontype` field (`"array"` or `"object"`) to tag tables. +- **Impact**: + - `ismap` and `islist` must check metatables or infer from key types. + - JSON serialization must preserve the array/object distinction. + - `isnode` must handle both cases. + - Empty tables are ambiguous without metatable tagging. +- **Recommendation**: Ensure all functions that create tables set appropriate metatables. Test empty table edge cases thoroughly. + +### 3. No `undefined` vs `null` Distinction + +- **Issue**: Lua has only `nil`. Setting a table key to `nil` removes it entirely. +- **Impact**: + - Cannot store `nil` as a value in a table (it deletes the key). + - Cannot distinguish "key absent" from "key set to null". + - The `NULLMARK`/`UNDEFMARK` marker system in the test runner handles this for testing. +- **Recommendation**: Consider a sentinel value for JSON null (e.g., `json.null` or a special table) to distinguish from absent keys. + +### 4. No Native JSON Type + +- **Issue**: Lua has no built-in JSON support. Relies on external JSON library (e.g., `cjson`, `dkjson`). +- **Impact**: JSON encoding/decoding behavior depends on which library is used. Different libraries handle edge cases differently (e.g., sparse arrays, special float values). + +### 5. String Patterns vs Regular Expressions + +- **Issue**: Lua uses its own pattern matching syntax, not POSIX or PCRE regular expressions. 
+- **Impact**: + - `escre` must escape Lua pattern special characters (`^$()%.[]*+-?`), which differ from regex special characters. + - `select` query with `$LIKE` operator must use Lua patterns, not regex. + - String matching in the test runner uses Lua patterns. +- **Recommendation**: Document that `escre` escapes Lua patterns, not standard regex. Consider if this behavioral difference is acceptable or if a regex library should be used. + +### 6. No Integer Type (Pre-Lua 5.3) + +- **Issue**: Lua 5.1/5.2 have only `number` (double-precision float). Lua 5.3+ added integer subtype. +- **Impact**: `typify` must detect whether a number is an integer or decimal. On Lua 5.3+, `math.type()` helps. On older versions, must check `val == math.floor(val)`. +- **Recommendation**: Ensure compatibility with target Lua version. + +### 7. Callable Tables Are Not Functions for `isfunc` + +- **Issue**: Lua functions and closures are both type `"function"`, which aligns well with TS. However, callable tables (with a `__call` metamethod) have type `"table"`. +- **Impact**: `isfunc` using `type(val) == "function"` won't detect callable tables. + +### 8. Table Length Operator `#` Unreliable for Sparse Arrays + +- **Issue**: The result of the `#` operator on tables with holes (nil gaps) is not well defined: it may return any index that directly precedes a nil gap. +- **Impact**: `size` for lists must be careful about sparse arrays. The implementation likely uses explicit iteration. + +--- + +## Validation Differences + +- **TS**: Uses `$MAP`, `$LIST`, `$STRING`, `$NUMBER`, `$INTEGER`, `$DECIMAL`, `$BOOLEAN`, `$NULL`, `$NIL`, `$FUNCTION`, `$INSTANCE`, `$ANY`, `$CHILD`, `$ONE`, `$EXACT`. +- **Lua**: Same validator set present. +- **Notes**: Aligned. + +--- + +## Transform Differences + +- **TS**: Full set of transform commands. +- **Lua**: Full set including `$DELETE`, `$COPY`, `$KEY`, `$ANNO`, `$MERGE`, `$EACH`, `$PACK`, `$REF`, `$FORMAT`, `$APPLY`. +- **Notes**: Aligned. 
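Returning to the pattern-versus-regex point under String Patterns vs Regular Expressions, the difference in escaping rules can be sketched in JavaScript. Both helper names below (`escapeRegex`, `escapeLuaPattern`) are hypothetical and exist only for this illustration; the library's own `escre` may differ in detail. The Lua variant uses the magic-character set quoted in the review and prefixes each with `%` rather than a backslash.

```javascript
// Escape a literal string for use inside a JavaScript RegExp.
// Regex metacharacters are prefixed with a backslash.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')
}

// Escape a literal string for use inside a Lua pattern.
// Lua's magic characters are ^$()%.[]*+-? and the escape character is %.
function escapeLuaPattern(s) {
  return s.replace(/[\^$()%.\[\]*+\-?]/g, '%$&')
}

// The same literal needs different escaping in each syntax:
// '-' is magic in a Lua pattern but not in a regex outside a character class,
// while '|', '{' and '}' are magic in a regex but not in a Lua pattern.
console.log(escapeRegex('a.b-c'))
console.log(escapeLuaPattern('a.b-c'))
```

Because the two escape sets differ, a string escaped for one syntax cannot be assumed safe in the other, which is why the review asks for the Lua behavior to be documented.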
+ +--- + +## Test Coverage + +Lua tests cover all major categories: +- Existence tests (116 checks), minor functions, walk, merge, getpath, inject, transform, validate, select, JSON builders. +- Uses shared `test.json` spec via `busted` test framework. +- Comprehensive test organization matching TS. + +--- + +## Alignment Plan + +### Phase 1: Index Boundary Verification (High Priority) +1. Audit all 0-based/1-based index translations in `getprop`, `setprop`, `getelem`, `delprop` +2. Add edge case tests for: empty list, single-element list, negative indices, boundary indices +3. Verify `slice` parameter translation matches TS behavior exactly +4. Verify `keysof` returns 0-based string indices for lists + +### Phase 2: Table Type Disambiguation +5. Audit metatable usage for array/object distinction +6. Ensure all table-creating functions set correct `__jsontype` metatable +7. Test empty table edge cases: `ismap({})`, `islist({})`, `isempty({})` +8. Verify `clone` preserves metatable tags + +### Phase 3: Missing Functions +9. Add `replace(s, from, to)` function +10. Consider adding `jm`/`jt` builders (may alias table constructors) + +### Phase 4: Pattern vs Regex Alignment +11. Document that `escre` escapes Lua patterns, not standard regex +12. Verify `select` `$LIKE` operator uses consistent pattern syntax +13. Consider adding a PCRE wrapper for cross-language consistency + +### Phase 5: Null Handling +14. Review JSON null representation throughout the codebase +15. Ensure nil-in-table edge cases are handled correctly +16. Test `inject`/`transform`/`validate` with null values in various positions + +### Phase 6: Full Test Suite Verification +17. Run complete test suite against shared `test.json` +18. Compare results with TS output for any discrepancies +19. 
Document any intentional Lua-specific behavioral differences diff --git a/lua/src/struct.lua b/lua/src/struct.lua index 12deadb0..23608b20 100644 --- a/lua/src/struct.lua +++ b/lua/src/struct.lua @@ -600,6 +600,27 @@ local function escurl(s) end +-- Replace a search string (all), or a pattern, in a source string. +local function replace(s, from, to) + local rs = s + local ts = typify(s) + if 0 == (T_string & ts) then + rs = stringify(s) + elseif 0 < ((T_noval | T_null) & ts) then + rs = S_MT + else + rs = stringify(s) + end + if type(from) == 'string' then + -- Plain string replacement (all occurrences) + return (rs:gsub(escre(from), to)) + else + -- Pattern replacement + return (rs:gsub(from, to)) + end +end + + -- Return a sub-array. Start and end are 0-based, end is exclusive. -- For numbers: clamp between start and end-1. -- For strings: substring from start to end. @@ -3266,6 +3287,7 @@ local StructUtility = { merge = merge, pad = pad, pathify = pathify, + replace = replace, select = select_fn, setpath = setpath, setprop = setprop, @@ -3305,6 +3327,11 @@ local StructUtility = { checkPlacement = checkPlacement, injectorArgs = injectorArgs, injectChild = injectChild, + + M_KEYPRE = M_KEYPRE, + M_KEYPOST = M_KEYPOST, + M_VAL = M_VAL, + MODENAME = MODENAME, } StructUtility.__index = StructUtility @@ -3342,6 +3369,7 @@ return { merge = merge, pad = pad, pathify = pathify, + replace = replace, select = select_fn, setpath = setpath, setprop = setprop, diff --git a/php/NOTES.md b/php/NOTES.md new file mode 100644 index 00000000..9d6ffe1e --- /dev/null +++ b/php/NOTES.md @@ -0,0 +1,22 @@ +# PHP Implementation Notes + +## undefined vs null + +PHP has only `null` — there is no native distinction between "absent" and "null". +For this library: +- The constant `UNDEF = '__UNDEFINED__'` is used as a sentinel for **property absence** + (the TypeScript `undefined` equivalent). 
+- TypeScript tests relating to `undefined` should be treated as **property absence**: the key + does not exist in the object/array, or the function parameter was not provided. +- JSON null is ambiguous with `null`. Where the distinction matters, the test runner uses + marker strings: `NULLMARK = '__NULL__'` for JSON null and `UNDEFMARK = '__UNDEF__'` for absent values. +- Note: the string sentinel `'__UNDEFINED__'` could theoretically collide with real data. + Consider using a unique object instance in future refactors. +- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues. + +## Type System + +This implementation uses bitfield integers for the type system, matching the TypeScript canonical. +Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are defined as class constants on `Struct` +and `typify()` returns integer bitfields. Use `typename()` to get the human-readable name for +error messages. Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`). diff --git a/php/REVIEW.md b/php/REVIEW.md new file mode 100644 index 00000000..b95fbc8a --- /dev/null +++ b/php/REVIEW.md @@ -0,0 +1,187 @@ +# PHP (php) - Review vs TypeScript Canonical + +## Overview + +The PHP version is a comprehensive implementation with **40+ functions** as static methods on the `Struct` class. It covers all major operations (inject, transform, validate, select) and has an extensive test suite (75+ test methods). The main differences are an older API pattern (positional parameters for inject/transform instead of unified `injdef`), reversed parameter order for `select`, and PHP-specific type handling. 
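The sentinel collision raised in the PHP implementation notes above can be sketched in JavaScript. This is an illustration only, under the assumption that absence is signalled by returning a marker value; the helper names `getWithStringSentinel` and `getWithObjectSentinel` are hypothetical, and `UNDEF_OBJ` stands in for the "unique object instance" the notes suggest.

```javascript
// String sentinel, modelled on the PHP notes' UNDEF = '__UNDEFINED__'.
const UNDEF_STR = '__UNDEFINED__'

// Hypothetical alternative: a unique frozen object, compared by identity.
const UNDEF_OBJ = Object.freeze({})

function getWithStringSentinel(map, key) {
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : UNDEF_STR
}

function getWithObjectSentinel(map, key) {
  return Object.prototype.hasOwnProperty.call(map, key) ? map[key] : UNDEF_OBJ
}

// Real data that happens to contain the sentinel string.
const data = { a: '__UNDEFINED__' }

// With a string sentinel, the present key 'a' is indistinguishable from an absent one.
const strLooksAbsent = getWithStringSentinel(data, 'a') === UNDEF_STR

// With an object sentinel, identity comparison cannot collide with JSON-like data.
const objLooksAbsent = getWithObjectSentinel(data, 'a') === UNDEF_OBJ

console.log(strLooksAbsent, objLooksAbsent)
```

An identity-compared object cannot appear in decoded JSON, so it removes the collision without changing how callers test for absence.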
+ +--- + +## Missing Functions + +| Function | Category | Impact | +|----------|----------|--------| +| `replace` | String | No unified string replace wrapper | +| `getdef` | Property access | No defined-or-default helper | +| `jm`/`jt` | JSON builders | No JSON builder functions | +| `typename` | Type system | Exists but verify alignment | + +--- + +## Naming Differences + +| TS Name | PHP Name | Notes | +|---------|----------|-------| +| All functions | `Struct::functionName` | Static methods on class | +| `escre` | `escre` (was `escapeRegex` in older version) | May have been renamed | +| `escurl` | `escurl` (was `escapeUrl` in older version) | May have been renamed | + +--- + +## API Signature Differences + +### 1. `inject` uses positional parameters instead of `injdef` + +- **TS**: `inject(val, store, injdef?)` where `injdef` is `Partial`. +- **PHP**: `inject($val, $store, $modify, $current, $injdef)` - separate positional params. +- **Impact**: Less extensible; harder to add new options. + +### 2. `transform` uses positional parameters instead of `injdef` + +- **TS**: `transform(data, spec, injdef?)`. +- **PHP**: `transform($data, $spec, $extra, $modify)` - separate params. +- **Impact**: Same extensibility concern. + +### 3. `validate` uses `injdef` but partially + +- **TS**: `validate(data, spec, injdef?)`. +- **PHP**: `validate($data, $spec, $injdef)` - closer to TS but `injdef` may be differently structured. + +### 4. `select` has reversed parameter order + +- **TS**: `select(children, query)` - children first, then query. +- **PHP**: `select($query, $children)` - query first, then children. +- **Impact**: **Breaking API difference**. Must be aligned. + +### 5. `getpath` uses older positional parameters + +- **TS**: `getpath(store, path, injdef?)`. +- **PHP**: `getpath($path, $store, $current, $state)` - path first, positional params. +- **Impact**: Different parameter order from TS canonical. + +### 6. 
`walk` signature + +- **TS**: `walk(val, before?, after?, maxdepth?, key?, parent?, path?)`. +- **PHP**: `walk($val, $before, $after, $maxdepth, $key, $parent, $path)` - matching TS. +- **Notes**: Correctly aligned with before/after/maxdepth pattern. + +--- + +## Validation Differences + +### Validator Names +- **TS**: `$MAP`, `$LIST`, `$STRING`, `$NUMBER`, `$INTEGER`, `$DECIMAL`, `$BOOLEAN`, `$NULL`, `$NIL`, `$FUNCTION`, `$INSTANCE`, `$ANY`, `$CHILD`, `$ONE`, `$EXACT`. +- **PHP**: `$OBJECT`, `$ARRAY`, `$STRING`, `$NUMBER`, `$BOOLEAN`, `$FUNCTION`, `$ANY`, `$CHILD`, `$ONE`, `$EXACT`. +- **Missing**: `$MAP` (uses `$OBJECT`), `$LIST` (uses `$ARRAY`), `$INTEGER`, `$DECIMAL`, `$NULL`, `$NIL`, `$INSTANCE`. +- **Impact**: Cannot distinguish integer from decimal validation; no null/nil validators. + +--- + +## Transform Differences + +- **TS**: `$DELETE`, `$COPY`, `$KEY`, `$ANNO`, `$MERGE`, `$EACH`, `$PACK`, `$REF`, `$FORMAT`, `$APPLY`, `$BT`, `$DS`, `$WHEN`. +- **PHP**: `$DELETE`, `$COPY`, `$KEY`, `$META`, `$ANNO`, `$MERGE`, `$EACH`, `$PACK`, `$REF`. Missing: `$FORMAT`, `$APPLY`, `$BT`, `$DS`, `$WHEN`. +- **Impact**: Cannot format strings or apply custom functions in transforms; no backtick/dollar escaping. + +--- + +## Significant Language Difference Issues + +### 1. `UNDEF` Is a String Constant + +- **Issue**: PHP uses `const UNDEF = '__UNDEFINED__'` (a string) as sentinel for absent values. +- **Impact**: If a real data value happens to be the string `'__UNDEFINED__'`, it will be misinterpreted as absent. This is unlikely but theoretically possible. +- **Recommendation**: Consider using a unique object instance (e.g., `new \stdClass()`) as sentinel instead of a string. + +### 2. PHP Arrays Are Both Lists and Maps + +- **Issue**: PHP arrays serve as both sequential lists and associative maps. `array(1, 2, 3)` and `array('a' => 1)` are the same type. +- **Impact**: `islist` must check for sequential integer keys starting at 0. 
`ismap` must detect non-sequential or string keys. This is fragile - operations that delete elements can turn a list into a map (non-sequential indices). +- **Recommendation**: Ensure `delprop` on lists re-indexes to maintain sequential keys. + +### 3. Objects vs Arrays for Maps + +- **Issue**: PHP can represent JSON objects as either `stdClass` objects or associative arrays. The library appears to use `stdClass` for maps in some contexts and arrays in others. +- **Impact**: `ismap` must handle both `is_object($val)` and associative arrays. Inconsistent representation can cause type-check failures. +- **Recommendation**: Standardize on one representation (preferably `stdClass` for maps to avoid list/map ambiguity). + +### 4. No `undefined` vs `null` Distinction + +- **Issue**: PHP has only `null`. The `UNDEF` string constant is used as a workaround. +- **Impact**: Same fundamental issue as Python/Lua/Go. Property access cannot distinguish "key absent" from "key is null". + +### 5. Pass-by-Value Semantics for Arrays + +- **Issue**: PHP arrays are copy-on-write. `setprop` uses `&$parent` (pass by reference) to modify in place. +- **Impact**: Callers must be careful about reference semantics. Some functions may unexpectedly create copies. + +### 6. No Function Overloading + +- **Issue**: PHP doesn't support function overloading. The `items` function uses an optional `$apply` callback parameter. +- **Notes**: This matches the TS approach (overloaded signatures compiled to single implementation). + +### 7. Weak Typing in Comparisons + +- **Issue**: PHP's `==` operator performs type coercion (`0 == "0"` is true; before PHP 8, `0 == ""` was also true). +- **Impact**: Comparisons in `select`, `validate`, and `haskey` must use `===` strict equality where appropriate. +- **Recommendation**: Audit all equality comparisons for strict vs loose equality usage. + +### 8. No Symbol Type + +- **Issue**: PHP has no equivalent of JavaScript Symbol. 
+- **Impact**: `T_symbol` type constant exists but `typify` will never return it. Minimal impact. + +--- + +## Test Coverage + +PHP has comprehensive test coverage (75+ test methods) covering: +- All minor functions, walk, merge, getpath, inject, transform, validate, select. +- Edge case tests for most functions. +- Uses shared `test.json` spec via PHPUnit. + +### Minor Gaps +- Some newer TS test categories may not be present (e.g., `transform-format`, `transform-apply` if those commands aren't implemented). + +--- + +## Alignment Plan + +### Phase 1: Critical API Fixes +1. Fix `select` parameter order to `select($children, $query)` to match TS +2. Align `getpath` parameter order to `getpath($store, $path, $injdef)` +3. Refactor `inject` to use `$injdef` object parameter instead of positional params +4. Refactor `transform` to use `$injdef` object parameter + +### Phase 2: Missing Validators +5. Add `$MAP` validator (alias or replacement for `$OBJECT`) +6. Add `$LIST` validator (alias or replacement for `$ARRAY`) +7. Add `$INTEGER` validator +8. Add `$DECIMAL` validator +9. Add `$NULL` and `$NIL` validators +10. Add `$INSTANCE` validator + +### Phase 3: Missing Transform Commands +11. Add `$FORMAT` transform command +12. Add `$APPLY` transform command +13. Add `$BT` (backtick escape) transform command +14. Add `$DS` (dollar sign escape) transform command +15. Add `$WHEN` (timestamp) transform command + +### Phase 4: Missing Functions +16. Add `getdef($val, $alt)` function +17. Add `replace($s, $from, $to)` function +18. Consider adding `jm`/`jt` JSON builder functions + +### Phase 5: UNDEF Sentinel Improvement +19. Consider replacing string `UNDEF` with object sentinel +20. Audit all `UNDEF` comparisons for correctness + +### Phase 6: Type System Alignment +21. Verify `typify` returns matching bitfield values +22. Verify `typename` output matches TS +23. Add any missing type constants + +### Phase 7: Test Alignment +24. 
Add tests for new validators and transform commands +25. Verify all test categories from TS `test.json` are covered +26. Fix any test failures from API changes diff --git a/php/src/Struct.php b/php/src/Struct.php index 733ad179..3853b3b0 100644 --- a/php/src/Struct.php +++ b/php/src/Struct.php @@ -3,6 +3,57 @@ namespace Voxgig\Struct; +/** + * Reference-stable wrapper for PHP arrays. + * PHP arrays are value types (copy-on-write), so storing them in injection + * state loses reference identity. ListRef wraps the array in an object + * (reference type) so mutations via setval/delprop propagate through the + * injection pipeline. Mirrors Go's ListRef[T] strategy. + */ +class ListRef implements \ArrayAccess, \Countable, \IteratorAggregate +{ + public array $list; + + public function __construct(array $list = []) + { + $this->list = $list; + } + + public function offsetExists(mixed $offset): bool + { + return isset($this->list[$offset]); + } + + public function offsetGet(mixed $offset): mixed + { + return $this->list[$offset] ?? null; + } + + public function offsetSet(mixed $offset, mixed $value): void + { + if ($offset === null) { + $this->list[] = $value; + } else { + $this->list[$offset] = $value; + } + } + + public function offsetUnset(mixed $offset): void + { + array_splice($this->list, (int)$offset, 1); + } + + public function count(): int + { + return count($this->list); + } + + public function getIterator(): \ArrayIterator + { + return new \ArrayIterator($this->list); + } +} + /** * Class Struct * @@ -87,7 +138,12 @@ class Struct /** * Private marker to indicate a skippable value. 
*/ - private static array $SKIP = ['__SKIP__' => true]; + private static array $SKIP = ['`$SKIP`' => true]; + + // Mode constants (bitfield) matching TypeScript canonical + public const M_KEYPRE = 1; + public const M_KEYPOST = 2; + public const M_VAL = 4; /* ======================= * Regular expressions for validation and transformation @@ -119,11 +175,12 @@ private static function isListHelper(array $val): bool public static function isnode(mixed $val): bool { - // We don't consider null or the undef‐marker to be a node. if ($val === self::UNDEF || $val === null) { return false; } - // Any PHP object *or* any PHP array is a node (map or list). + if ($val instanceof ListRef) { + return true; + } return is_object($val) || is_array($val); } @@ -137,7 +194,9 @@ public static function isnode(mixed $val): bool */ public static function ismap(mixed $val): bool { - // Any PHP object (stdClass, etc.) is a map + if ($val instanceof ListRef) { + return false; + } if (is_object($val)) { return true; } @@ -162,6 +221,9 @@ public static function ismap(mixed $val): bool */ public static function islist(mixed $val): bool { + if ($val instanceof ListRef) { + return true; + } if (!is_array($val)) { return false; } @@ -248,6 +310,9 @@ public static function typify(mixed $value): int if (is_callable($value) && !is_array($value) && !is_object($value)) { return self::T_scalar | self::T_function; } + if ($value instanceof ListRef) { + return self::T_node | self::T_list; + } if (is_array($value)) { if (self::islist($value)) { return self::T_node | self::T_list; @@ -270,6 +335,58 @@ public static function typename(int $type): string return self::TYPENAME[$clz] ?? self::TYPENAME[0]; } + /** + * Get a defined value. Returns alt if val is undefined. + */ + public static function getdef(mixed $val, mixed $alt): mixed + { + if ($val === self::UNDEF || $val === null) { + return $alt; + } + return $val; + } + + /** + * Replace a search string (all), or a regex pattern, in a source string. 
+ */ + public static function replace(string $s, string|array $from, mixed $to): string + { + $rs = $s; + $ts = self::typify($s); + if (0 === (self::T_string & $ts)) { + $rs = self::stringify($s); + } elseif (0 < ((self::T_noval | self::T_null) & $ts)) { + $rs = self::S_MT; + } + if (is_string($from) && @preg_match($from, '') !== false && $from[0] === '/') { + return preg_replace($from, (string)$to, $rs); + } + return str_replace((string)$from, (string)$to, $rs); + } + + /** + * Define a JSON Object using key-value arguments. + */ + public static function jm(mixed ...$kv): object + { + $kvsize = count($kv); + $o = new \stdClass(); + for ($i = 0; $i < $kvsize; $i += 2) { + $k = $kv[$i] ?? ('$KEY' . $i); + $k = is_string($k) ? $k : self::stringify($k); + $o->$k = $kv[$i + 1] ?? null; + } + return $o; + } + + /** + * Define a JSON Array using arguments. + */ + public static function jt(mixed ...$v): array + { + return array_values($v); + } + public static function getprop(mixed $val, mixed $key, mixed $alt = self::UNDEF): mixed { // 1) undefined‐marker or invalid key → alt @@ -283,11 +400,16 @@ public static function getprop(mixed $val, mixed $key, mixed $alt = self::UNDEF) return $alt; } - // 2) array branch stays the same - if (is_array($val) && array_key_exists($key, $val)) { + // 2) ListRef branch + if ($val instanceof ListRef) { + $ki = is_numeric($key) ? (int)$key : -1; + $out = ($ki >= 0 && $ki < count($val->list)) ? $val->list[$ki] : $alt; + } + // 3) array branch + elseif (is_array($val) && array_key_exists($key, $val)) { $out = $val[$key]; } - // 3) object branch: cast $key to string + // 4) object branch: cast $key to string elseif (is_object($val)) { $prop = (string) $key; if (property_exists($val, $prop)) { @@ -341,9 +463,10 @@ public static function keysof(mixed $val): array $keys = is_array($val) ? 
array_keys($val) : array_keys(get_object_vars($val)); sort($keys, SORT_STRING); return $keys; + } elseif ($val instanceof ListRef) { + return array_map('strval', array_keys($val->list)); } elseif (self::islist($val)) { - $keys = array_keys($val); - return array_map('strval', $keys); + return array_map('strval', array_keys($val)); } return []; } @@ -468,6 +591,9 @@ public static function jsonify(mixed $val, mixed $flags = null): string $str = 'null'; if ($val !== null && $val !== self::UNDEF && !($val instanceof \Closure)) { + if ($val instanceof ListRef) { + $val = self::cloneUnwrap($val); + } $indent = self::getprop($flags, 'indent', 2); try { $encoded = json_encode($val, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES); @@ -576,13 +702,17 @@ public static function slice(mixed $val, ?int $start = null, ?int $end = null): } if (-1 < $start && $start <= $end && $end <= $vlen) { - if (self::islist($val)) { + if ($val instanceof ListRef) { + $val = new ListRef(array_slice($val->list, $start, $end - $start)); + } elseif (self::islist($val)) { $val = array_slice($val, $start, $end - $start); } elseif (is_string($val)) { $val = substr($val, $start, $end - $start); } } else { - if (self::islist($val)) { + if ($val instanceof ListRef) { + $val = new ListRef([]); + } elseif (self::islist($val)) { $val = []; } elseif (is_string($val)) { $val = self::S_MT; @@ -656,6 +786,10 @@ public static function stringify(mixed $val, ?int $maxlen = null, mixed $pretty $valstr = $val; } else { $original = $val; + // Unwrap ListRefs before JSON encoding + if ($val instanceof ListRef) { + $val = self::cloneUnwrap($val); + } try { $sorted = self::sort_obj($val); $str = json_encode($sorted); @@ -768,9 +902,15 @@ public static function clone(mixed $val): mixed } $refs = []; $replacer = function (mixed $v) use (&$refs, &$replacer): mixed { - if (is_callable($v)) { + if (is_callable($v) && !is_array($v) && !($v instanceof ListRef)) { $refs[] = $v; return '`$FUNCTION:' . (count($refs) - 1) . 
'`'; + } elseif ($v instanceof ListRef) { + $newList = []; + foreach ($v->list as $item) { + $newList[] = $replacer($item); + } + return new ListRef($newList); } elseif (is_array($v)) { $result = []; foreach ($v as $k => $item) { @@ -795,6 +935,12 @@ public static function clone(mixed $val): mixed return $refs[(int) $matches[1]]; } return $v; + } elseif ($v instanceof ListRef) { + $newList = []; + foreach ($v->list as $item) { + $newList[] = $reviver($item); + } + return new ListRef($newList); } elseif (is_array($v)) { $result = []; foreach ($v as $k => $item) { @@ -815,6 +961,84 @@ public static function clone(mixed $val): mixed return $reviver($temp); } + /** + * Clone a value, wrapping all sequential arrays in ListRef for reference stability. + * Mirrors Go's CloneFlags(val, {wrap: true}). + */ + public static function cloneWrap(mixed $val): mixed + { + if ($val === null || $val === self::UNDEF) { + return $val; + } + if (is_callable($val) && !is_array($val) && !is_object($val)) { + return $val; + } + if ($val instanceof ListRef) { + $newList = []; + foreach ($val->list as $item) { + $newList[] = self::cloneWrap($item); + } + return new ListRef($newList); + } + if (is_array($val)) { + if (self::isListHelper($val) || empty($val)) { + $newList = []; + foreach ($val as $item) { + $newList[] = self::cloneWrap($item); + } + return new ListRef($newList); + } + // Assoc array (map-array) - clone as stdClass + $result = new \stdClass(); + foreach ($val as $k => $v) { + $result->$k = self::cloneWrap($v); + } + return $result; + } + if ($val instanceof \stdClass) { + $result = new \stdClass(); + foreach (get_object_vars($val) as $k => $v) { + $result->$k = self::cloneWrap($v); + } + return $result; + } + // Class instances and scalars returned as-is + return $val; + } + + /** + * Clone a value, unwrapping all ListRef back to plain arrays. + * Mirrors Go's CloneFlags(val, {unwrap: true}). 
+ */ + public static function cloneUnwrap(mixed $val, int $depth = 0): mixed + { + if ($depth > 32) { + return $val; + } + if ($val instanceof ListRef) { + $result = []; + foreach ($val->list as $item) { + $result[] = self::cloneUnwrap($item, $depth + 1); + } + return $result; + } + if ($val instanceof \stdClass) { + $result = new \stdClass(); + foreach (get_object_vars($val) as $k => $v) { + $result->$k = self::cloneUnwrap($v, $depth + 1); + } + return $result; + } + if (is_array($val)) { + $result = []; + foreach ($val as $k => $v) { + $result[$k] = self::cloneUnwrap($v, $depth + 1); + } + return $result; + } + return $val; + } + /** * @internal * Set a property or list‐index on a "node" (stdClass or PHP array). @@ -828,6 +1052,28 @@ public static function setprop(mixed &$parent, mixed $key, mixed $val): mixed return $parent; } + // ─── LISTREF ──────────────────────────────────────────────── + if ($parent instanceof ListRef) { + if (!is_numeric($key)) { + return $parent; + } + $keyI = (int) floor((float) $key); + if ($val === self::UNDEF) { + if ($keyI >= 0 && $keyI < count($parent->list)) { + array_splice($parent->list, $keyI, 1); + } + } elseif ($keyI >= 0) { + if ($keyI >= count($parent->list)) { + $parent->list[] = $val; + } else { + $parent->list[$keyI] = $val; + } + } else { + array_unshift($parent->list, $val); + } + return $parent; + } + // ─── OBJECT (map) ─────────────────────────────────────────── if (is_object($parent)) { $keyStr = self::strkey($key); @@ -997,8 +1243,8 @@ public static function merge(mixed $val, ?int $maxdepth = null): mixed } public static function getpath( - mixed $path, mixed $store, + mixed $path, mixed $current = null, mixed $state = null ): mixed { @@ -1012,7 +1258,7 @@ public static function getpath( } $val = $store; - $base = self::getprop($state, 'base', self::S_DTOP); + $base = self::getprop($state, 'base'); $src = self::getprop($store, $base, $store); $numparts = count($parts); $dparent = self::getprop($state, 'dparent'); 
@@ -1061,7 +1307,7 @@ public static function getpath( } else if ($state && str_starts_with($part, '$GET:')) { // $GET:path$ -> get store value, use as path part (string) $getpath = substr($part, 5, -1); - $getval = self::getpath($getpath, $src, null, null); + $getval = self::getpath($src, $getpath, null, null); $part = self::stringify($getval); } else if ($state && str_starts_with($part, '$REF:')) { // $REF:refpath$ -> get spec value, use as path part (string) @@ -1079,7 +1325,7 @@ public static function getpath( } } else if ($state && str_starts_with($part, '$META:')) { // $META:metapath$ -> get meta value, use as path part (string) - $part = self::stringify(self::getpath(substr($part, 6, -1), self::getprop($state, 'meta'), null, null)); + $part = self::stringify(self::getpath(self::getprop($state, 'meta'), substr($part, 6, -1), null, null)); } // $$ escapes $ @@ -1110,7 +1356,7 @@ public static function getpath( $fullpath = array_merge($dpath_slice, $parts_slice); if (is_array($dpath) && $ascends <= count($dpath)) { - $val = self::getpath($fullpath, $store, null, null); + $val = self::getpath($store, $fullpath, null, null); } else { $val = self::UNDEF; } @@ -1135,7 +1381,7 @@ public static function getpath( $handler = self::getprop($state, 'handler'); if ($state !== null && self::isfunc($handler)) { $ref = self::pathify($path); - $val = $handler($state, $val, $ref, $store); + $val = call_user_func($handler, $state, $val, $ref, $store); } return $val; @@ -1145,61 +1391,130 @@ public static function getpath( public static function inject( mixed $val, mixed $store, - ?callable $modify = null, - mixed $current = null, - ?object $injdef = null + mixed $injdef = null ): mixed { - // Check if we're using an existing injection state - if ($injdef !== null && property_exists($injdef, 'mode')) { - // Use the existing injection state directly - $state = $injdef; - } else { - // Create a state object to track the injection process - $state = (object) [ - 'mode' => 
self::S_MVAL, - 'key' => self::S_DTOP, - 'parent' => null, - 'path' => [self::S_DTOP], - 'nodes' => [], - 'keys' => [self::S_DTOP], - 'keyI' => 0, - 'base' => self::S_DTOP, - 'modify' => $modify, - 'full' => false, - 'handler' => [self::class, '_injecthandler'], - 'dparent' => null, - 'dpath' => [self::S_DTOP], - 'errs' => [], - 'meta' => (object) [], - ]; - - // Set up data context - if ($current === null) { - $current = self::getprop($store, self::S_DTOP); - if ($current === self::UNDEF) { - $current = $store; + $valtype = gettype($val); + + /** @var Injection $inj */ + $inj = $injdef; + + // Create state if at root of injection. The input value is placed + // inside a virtual parent holder to simplify edge cases. + if (self::UNDEF === $injdef || null === $injdef || !($injdef instanceof Injection)) { + $inj = new Injection($val, (object) [self::S_DTOP => $val]); + $inj->dparent = $store; + $inj->errs = self::getprop($store, self::S_DERRS, []); + if (!isset($inj->meta->__d)) { + $inj->meta->__d = 0; + } + + if (self::UNDEF !== $injdef && null !== $injdef) { + $inj->modify = (is_object($injdef) && property_exists($injdef, 'modify') && null !== $injdef->modify) ? $injdef->modify : $inj->modify; + $inj->extra = (is_object($injdef) && property_exists($injdef, 'extra') && null !== $injdef->extra) ? $injdef->extra : ($inj->extra ?? null); + $inj->meta = (is_object($injdef) && property_exists($injdef, 'meta') && null !== $injdef->meta) ? $injdef->meta : $inj->meta; + $inj->handler = (is_object($injdef) && property_exists($injdef, 'handler') && null !== $injdef->handler) ? $injdef->handler : $inj->handler; + } + } + + $inj->descend(); + + // Descend into node. 
+ if (self::isnode($val)) { + $nodekeys = self::keysof($val); + + if (self::ismap($val)) { + $nonDollar = []; + $dollar = []; + foreach ($nodekeys as $nk) { + if (str_contains((string) $nk, self::S_DS)) { + $dollar[] = $nk; + } else { + $nonDollar[] = $nk; + } } + $nodekeys = array_merge($nonDollar, $dollar); + } else { + $nodekeys = self::keysof($val); } - $state->dparent = $current; - // Create a virtual parent holder like TypeScript does - $holder = (object) [self::S_DTOP => $val]; - $state->parent = $holder; - $state->nodes = [$holder]; + for ($nkI = 0; $nkI < count($nodekeys); $nkI++) { + $childinj = $inj->child($nkI, $nodekeys); + $nodekey = $childinj->key; + $childinj->mode = self::M_KEYPRE; + + // Perform the key:pre mode injection on the child key. + $prekey = self::_injectstr($nodekey, $store, $childinj); + + // The injection may modify child processing. + $nkI = $childinj->keyI; + $nodekeys = $childinj->keys; + + // Prevent further processing by returning an undefined prekey + if (self::UNDEF !== $prekey) { + $childinj->val = self::getprop($val, $prekey); + $childinj->mode = self::M_VAL; + + // Perform the val mode injection on the child value. + // NOTE: return value is not used. + self::inject($childinj->val, $store, $childinj); + + // The injection may modify child processing. + $nkI = $childinj->keyI; + $nodekeys = $childinj->keys; + + // Perform the key:post mode injection on the child key. + $childinj->mode = self::M_KEYPOST; + self::_injectstr($nodekey, $store, $childinj); + + // The injection may modify child processing. + $nkI = $childinj->keyI; + $nodekeys = $childinj->keys; + } + + // PHP: arrays are value types; propagate child mutations back to val & parent. + // Skip sync if a transform modified an ancestor (checked via prior chain). + if (is_array($val) && is_array($childinj->parent)) { + // Check that the grandparent (inj->parent) still references our list. + // If a transform like $REF replaced/deleted it, the stored value will differ. 
+ $storedVal = self::getprop($inj->parent, $inj->key); + if (is_array($storedVal)) { + $val = $childinj->parent; + $inj->val = $val; + self::setprop($inj->parent, $inj->key, $val); + } + } + } + } + // Inject paths into string scalars. + else if ($valtype === 'string') { + $inj->mode = self::M_VAL; + $val = self::_injectstr($val, $store, $inj); + if (self::$SKIP !== $val) { + $inj->setval($val); + } } - // Process the value through _injectval - $modifiedVal = self::_injectval($state, $val, $state->dparent ?? $current, $store); - - // For existing injection states, just update and return the modified value - if ($injdef !== null && property_exists($injdef, 'mode')) { - $state->val = $modifiedVal; - return $modifiedVal; + // Custom modification. + if ($inj->modify && self::$SKIP !== $val) { + $mkey = $inj->key; + $mparent = $inj->parent; + $mval = self::getprop($mparent, $mkey); + + call_user_func( + $inj->modify, + $mval, + $mkey, + $mparent, + $inj, + $store + ); } - - // For new injection states, update the holder and return from it - self::setprop($state->parent, self::S_DTOP, $modifiedVal); - return self::getprop($state->parent, self::S_DTOP); + + $inj->val = $val; + + // Original val reference may no longer be correct. + // This return value is only used as the top level result. + return self::getprop($inj->parent, self::S_DTOP); } @@ -1213,6 +1528,8 @@ private static function _injectstr( return self::S_MT; } + $out = $val; + // Pattern examples: "`a.b.c`", "`$NAME`", "`$NAME1`", "``" $m = preg_match('/^`(\$[A-Z]+|[^`]*)[0-9]*`$/', $val, $matches); @@ -1224,64 +1541,47 @@ private static function _injectstr( $pathref = $matches[1]; // Special escapes inside injection. - // Only apply escape handling to strings longer than 3 characters - // to avoid affecting transform command names like $BT (length 3) and $DS (length 2) if (strlen($pathref) > 3) { - // Handle escaped dots FIRST: \. -> . 
$pathref = str_replace('\\.', '.', $pathref); - // Then handle $BT and $DS $pathref = str_replace('$BT', self::S_BT, $pathref); $pathref = str_replace('$DS', self::S_DS, $pathref); } // Get the extracted path reference. - $current = ($inj !== null && property_exists($inj, 'dparent')) ? $inj->dparent : null; - $out = self::getpath($pathref, $store, $current, $inj); - // When result is a transform (callable), run it via the handler - if ($inj !== null && is_callable($inj->handler) && is_callable($out) && str_starts_with($pathref, self::S_DS)) { - $out = call_user_func($inj->handler, $inj, $out, $pathref, $store); - } - return $out; + $out = self::getpath($store, $pathref, null, $inj); } + else { + // Check for injections within the string. + $out = preg_replace_callback('/`([^`]+)`/', function($matches) use ($store, $inj) { + $ref = $matches[1]; + + if (strlen($ref) > 3) { + $ref = str_replace('\\.', '.', $ref); + $ref = str_replace('$BT', self::S_BT, $ref); + $ref = str_replace('$DS', self::S_DS, $ref); + } + if ($inj !== null) { + $inj->full = false; + } - // Check for injections within the string. - $out = preg_replace_callback('/`([^`]+)`/', function($matches) use ($store, $inj) { - $ref = $matches[1]; + $found = self::getpath($store, $ref, null, $inj); - // Special escapes inside injection. - // Only apply escape handling to strings longer than 3 characters - // to avoid affecting transform command names like $BT (length 3) and $DS (length 2) - if (strlen($ref) > 3) { - // Handle escaped dots FIRST: \. -> . - $ref = str_replace('\\.', '.', $ref); - // Then handle $BT and $DS - $ref = str_replace('$BT', self::S_BT, $ref); - $ref = str_replace('$DS', self::S_DS, $ref); - } - if ($inj !== null) { - $inj->full = false; - } - // Use dparent from injection state as current context for relative path resolution - $current = ($inj !== null && property_exists($inj, 'dparent')) ? 
$inj->dparent : null; - $found = self::getpath($ref, $store, $current, $inj); + // Ensure inject value is a string. + if ($found === self::UNDEF) { + return self::S_MT; + } + if (is_string($found)) { + return $found; + } + return json_encode($found instanceof ListRef ? self::cloneUnwrap($found) : $found); + }, $val); - // Ensure inject value is a string. - if ($found === self::UNDEF) { - return self::S_MT; - } - if (is_string($found)) { - return $found; + // Also call the inj handler on the entire string, providing the + // option for custom injection. + if ($inj !== null && is_callable($inj->handler)) { + $inj->full = true; + $out = call_user_func($inj->handler, $inj, $out, $val, $store); } - return json_encode($found); - }, $val); - - // Also call the inj handler on the entire string, providing the - // option for custom injection. - if ($inj !== null && is_callable($inj->handler)) { - $inj->full = true; - // Use the extracted pathref if this was a full injection, otherwise original val - $ref = isset($pathref) ? $pathref : $val; - $out = call_user_func($inj->handler, $inj, $out, $ref, $store); } return $out; @@ -1303,18 +1603,18 @@ private static function _injectexpr( } // Otherwise treat it as a path - $result = self::getpath($expr, $store, $current, $state); + $result = self::getpath($store, $expr, $current, $state); return $result; } - private static function _injecthandler( + public static function _injecthandler( object $inj, mixed $val, string $ref, mixed $store ): mixed { $out = $val; - + // Check if val is a function (command transforms) $iscmd = self::isfunc($val) && (self::UNDEF === $ref || str_starts_with($ref, self::S_DS)); @@ -1323,8 +1623,8 @@ private static function _injecthandler( $out = call_user_func($val, $inj, $val, $ref, $store); } // Update parent with value. Ensures references remain in node tree. 
- elseif (self::S_MVAL === $inj->mode && $inj->full) { - self::setprop($inj->parent, $inj->key, $out); + elseif (self::M_VAL === $inj->mode && $inj->full) { + $inj->setval($val); } return $out; } @@ -1349,7 +1649,7 @@ public static function transform_DELETE( mixed $store ): mixed { // _setparentprop(state, UNDEF) - self::_setparentprop($state, self::UNDEF); + $state->setval(self::UNDEF); return self::UNDEF; } @@ -1363,20 +1663,13 @@ public static function transform_COPY( mixed $ref, mixed $store ): mixed { - $mode = $state->mode; - $key = $state->key; - - $out = $key; - if (!str_starts_with($mode, self::S_MKEY)) { - // For root-level copies where key is "$TOP", return dparent directly - if ($key === self::S_DTOP) { - $out = $state->dparent; - } else { - $out = self::getprop($state->dparent, $key); - } - self::_setparentprop($state, $out); + if (self::M_VAL !== $state->mode) { + return self::UNDEF; } + $out = self::getprop($state->dparent, $state->key); + $state->setval($out); + return $out; } @@ -1392,7 +1685,7 @@ public static function transform_KEY( mixed $store ): mixed { // only in "val" mode do anything - if ($state->mode !== self::S_MVAL) { + if (self::M_VAL !== $state->mode) { return self::UNDEF; } @@ -1448,7 +1741,7 @@ public static function transform_ANNO( * @internal * Merge a list of objects into the current object. 
*/ - public static function transform_MERGE( + public static function transform_MERGE( object $state, mixed $val, mixed $ref, @@ -1458,173 +1751,122 @@ public static function transform_MERGE( $key = $state->key; $parent = $state->parent; - // in key:pre, do all the merge work and remove the key - if ($mode === self::S_MKEYPRE) { - // gather the args under parent[key] - $args = self::getprop($parent, $key); - - // empty-string means "merge top-level store" - if ($args === self::S_MT) { - $args = [self::getprop($state->dparent, self::S_DTOP)]; - } - // coerce single value into array - elseif (!is_array($args)) { - $args = [$args]; - } - - // Resolve each argument to get data values - $resolvedArgs = []; - foreach ($args as $arg) { - if (is_string($arg)) { - // Check if it's an injection string like '`a`' - if (preg_match('/^`(\$[A-Z]+|[^`]*)[0-9]*`$/', $arg, $matches)) { - $pathref = $matches[1]; - // Handle escapes - if (strlen($pathref) > 3) { - $pathref = str_replace('\\.', '.', $pathref); - $pathref = str_replace('$BT', '`', $pathref); - $pathref = str_replace('$DS', '$', $pathref); - } - $resolved = self::getpath($pathref, $store); - } else { - $resolved = $arg; - } - $resolvedArgs[] = $resolved; - } else { - $resolvedArgs[] = $arg; - } - } + // Ensures $MERGE is removed from parent list (val mode). + $out = self::UNDEF; - // remove the $MERGE entry from parent - self::setprop($parent, $key, self::UNDEF); + if (self::M_KEYPRE === $mode) { + $out = $key; + } + // Operate after child values have been transformed. + elseif (self::M_KEYPOST === $mode) { + $out = $key; - // build list: [ parent, ...resolvedArgs, clone(parent) ] - $mergelist = array_merge( - [$parent], - $resolvedArgs, - [clone $parent] - ); + $args = self::getprop($parent, $key); + $args = self::islist($args) ? (($args instanceof ListRef) ? 
$args->list : $args) : [$args]; - // perform merge - this modifies the parent in place - self::merge($mergelist); + // Remove the $MERGE command from a parent map. + $state->setval(self::UNDEF); - // return UNDEF to prevent further processing of this key - return self::UNDEF; - } + // Literals in the parent have precedence, but we still merge onto + // the parent object, so that node tree references are not changed. + $mergelist = self::flatten([[$parent], $args, [clone $parent]]); - // in key:post, the merge is already done, just return the key - if ($mode === self::S_MKEYPOST) { - return $key; + self::merge($mergelist); } - // otherwise drop it - return self::UNDEF; + return $out; } + public static function transform_EACH( object $state, mixed $_val, string $_ref, mixed $store ): mixed { - // Remove arguments to avoid spurious processing - if (isset($state->keys)) { - $state->keys = array_slice($state->keys, 0, 1); - } + // Remove remaining keys to avoid spurious processing. + $state->keys = array_slice($state->keys, 0, 1); - if (self::S_MVAL !== $state->mode) { + if (self::M_VAL !== $state->mode) { return self::UNDEF; } // Get arguments: ['`$EACH`', 'source-path', child-template] $srcpath = self::getprop($state->parent, 1); $child = self::clone(self::getprop($state->parent, 2)); - - // Source data + + // Source data. $srcstore = self::getprop($store, $state->base, $store); - $src = self::getpath($srcpath, $srcstore, $state); + $src = self::getpath($srcstore, $srcpath, null, $state); - // Create parallel data structures: source entries :: child templates + // Create parallel data structures: source entries :: child templates $tcur = []; $tval = []; $tkey = self::getelem($state->path, -2); $target = self::getelem($state->nodes, -2) ?? self::getelem($state->nodes, -1); - // Create clones of the child template for each value of the current source + // Create clones of the child template for each value of the current source. 
if (self::islist($src)) { + $srcArr = ($src instanceof ListRef) ? $src->list : (array) $src; $tval = array_map(function($_) use ($child) { return self::clone($child); - }, $src); + }, $srcArr); } elseif (self::ismap($src)) { $tval = []; - foreach ($src as $k => $v) { - $template = self::clone($child); - // Make a note of the key for $KEY transforms - self::setprop($template, self::S_BANNO, (object) [self::S_KEY => $k]); + foreach (self::items($src) as $item) { + $template = self::merge([ + self::clone($child), + (object) [self::S_BANNO => (object) [self::S_KEY => $item[0]]] + ], 1); $tval[] = $template; } } - + $rval = []; - if (count($tval) > 0) { - $tcur = (null == $src) ? self::UNDEF : array_values((array) $src); + if (0 < self::size($tval)) { + $tcur = (null == $src) ? self::UNDEF : ($src instanceof ListRef ? $src->list : array_values((array) $src)); $ckey = self::getelem($state->path, -2); - $tpath = array_slice($state->path, 0, -1); - - // Build dpath like TypeScript: [S_DTOP, ...srcpath.split('.'), '$:' + ckey] - $dpath = [self::S_DTOP]; - $dpath = array_merge($dpath, explode('.', $srcpath), ['$:' . $ckey]); - - // Build parent structure like TypeScript version - $tcur = [$ckey => $tcur]; - - if (count($tpath) > 1) { - $pkey = self::getelem($state->path, -3) ?? self::S_DTOP; - $tcur = [$pkey => $tcur]; + + $tpath = self::slice($state->path, -1); + $dpath = self::flatten([self::S_DTOP, explode(self::S_DT, $srcpath), '$:' . $ckey]); + + // Parent structure. + $tcur = (object) [$ckey => $tcur]; + + if (1 < self::size($tpath)) { + $pkey = self::getelem($state->path, -3, self::S_DTOP); + $tcur = (object) [$pkey => $tcur]; $dpath[] = '$:' . 
$pkey; } - // Create child injection state matching TypeScript version - $tinj = (object) [ - 'mode' => self::S_MVAL, - 'full' => false, - 'keyI' => 0, - 'keys' => [$ckey], - 'key' => $ckey, - 'val' => $tval, - 'parent' => self::getelem($state->nodes, -1), - 'path' => $tpath, - 'nodes' => array_slice($state->nodes, 0, -1), - 'handler' => [self::class, '_injecthandler'], - 'base' => $state->base, - 'modify' => $state->modify, - 'errs' => $state->errs ?? [], - 'meta' => $state->meta ?? (object) [], - 'dparent' => $tcur, // Use the full nested structure like TypeScript - 'dpath' => $dpath, - ]; - - // Set tval in parent like TypeScript version + $tinj = $state->child(0, [$ckey]); + $tinj->path = $tpath; + $tinj->nodes = self::slice($state->nodes, -1); + + $tinj->parent = self::getelem($tinj->nodes, -1); self::setprop($tinj->parent, $ckey, $tval); - // Inject using the proper injection state - $result = self::inject($tval, $store, $state->modify, $tinj->dparent, $tinj); - + $tinj->val = $tval; + $tinj->dpath = $dpath; + $tinj->dparent = $tcur; + + self::inject($tval, $store, $tinj); $rval = $tinj->val; } - // Update ancestors using the simple approach like TypeScript - self::_updateAncestors($state, $target, $tkey, $rval); + // Update ancestors. + self::setprop($target, $tkey, $rval); // Prevent callee from damaging first list entry (since we are in `val` mode). - return count($rval) > 0 ? $rval[0] : self::UNDEF; + return $rval[0] ?? self::UNDEF; } + /** @internal */ public static function transform_PACK( object $state, @@ -1638,181 +1880,155 @@ public static function transform_PACK( $parent = $state->parent; $nodes = $state->nodes; - // Defensive context checks - only run in key:pre mode - if (self::S_MKEYPRE !== $mode || !is_string($key) || null == $path || null == $nodes) { + // Only run in key:pre mode. + if (self::M_KEYPRE !== $mode) { return self::UNDEF; } - // Get arguments + // Get arguments. 
$args = self::getprop($parent, $key); if (!is_array($args) || count($args) < 2) { return self::UNDEF; } - $srcpath = $args[0]; // Path to source data - $child = self::clone($args[1]); // Child template + $srcpath = $args[0]; + $origchildspec = self::clone($args[1]); - // Find key and target node - $keyprop = self::getprop($child, self::S_BKEY); + // Find key and target node. $tkey = self::getelem($path, -2); - $target = $nodes[count($path) - 2] ?? $nodes[count($path) - 1]; + $pathsize = self::size($path); + $target = self::getelem($nodes, $pathsize - 2) ?? self::getelem($nodes, $pathsize - 1); // Source data $srcstore = self::getprop($store, $state->base, $store); - $src = self::getpath($srcpath, $srcstore, null, $state); - - // Prepare source as a list - matching TypeScript logic exactly - if (self::islist($src)) { - $src = $src; - } elseif (self::ismap($src)) { - // Transform map to list with KEY annotations like TypeScript - $newSrc = []; - foreach ($src as $k => $node) { - $node = (array) $node; // Ensure it's an array for setprop - $node[self::S_BANNO] = (object) [self::S_KEY => $k]; - $newSrc[] = (object) $node; - } - $src = $newSrc; - } else { - return self::UNDEF; + $src = self::getpath($srcstore, $srcpath, null, $state); + + // Prepare source as a list. + if (!self::islist($src)) { + if (self::ismap($src)) { + $newSrc = []; + foreach (self::items($src) as $item) { + self::setprop($item[1], self::S_BANNO, (object) [self::S_KEY => $item[0]]); + $newSrc[] = $item[1]; + } + $src = $newSrc; + } else { + return self::UNDEF; + } } if (null == $src) { return self::UNDEF; } - // Get key if specified - matching TypeScript logic - $childkey = self::getprop($child, self::S_BKEY); - $keyname = $childkey !== self::UNDEF ? $childkey : $keyprop; - self::delprop($child, self::S_BKEY); + // Get keypath. 
+ $keypath = self::getprop($origchildspec, self::S_BKEY); + $childspec = self::delprop($origchildspec, self::S_BKEY); + + $child = $childspec; - // Build parallel target object using reduce pattern from TypeScript + // Build parallel target object. $tval = new \stdClass(); - foreach ($src as $node) { - $kn = self::getprop($node, $keyname); - if ($kn !== self::UNDEF) { - self::setprop($tval, $kn, self::clone($child)); - $nchild = self::getprop($tval, $kn); - - // Transfer annotation data if present - $mval = self::getprop($node, self::S_BANNO); - if ($mval === self::UNDEF) { - self::delprop($nchild, self::S_BANNO); + + foreach (self::items($src) as $item) { + $srckey = $item[0]; + $srcnode = $item[1]; + + $nkey = $srckey; + if (self::UNDEF !== $keypath) { + if (is_string($keypath) && str_starts_with($keypath, '`')) { + $nkey = self::inject($keypath, self::merge([new \stdClass(), $store, (object) ['$TOP' => $srcnode]], 1)); } else { - self::setprop($nchild, self::S_BANNO, $mval); + $nkey = self::getpath($srcnode, $keypath, null, $state); } } + + $tchild = self::clone($child); + self::setprop($tval, $nkey, $tchild); + + $anno = self::getprop($srcnode, self::S_BANNO); + if (self::UNDEF === $anno) { + self::delprop($tchild, self::S_BANNO); + } else { + self::setprop($tchild, self::S_BANNO, $anno); + } } $rval = new \stdClass(); - if (count((array) $tval) > 0) { - // Build parallel source object - $tcur = new \stdClass(); - foreach ($src as $node) { - $kn = self::getprop($node, $keyname); - if ($kn !== self::UNDEF) { - self::setprop($tcur, $kn, $node); + if (!self::isempty($tval)) { + // Build parallel source object. 
+ $tsrc = new \stdClass(); + foreach ($src as $i => $n) { + $kn = null; + if (self::UNDEF === $keypath) { + $kn = $i; + } elseif (is_string($keypath) && str_starts_with($keypath, '`')) { + $kn = self::inject($keypath, self::merge([new \stdClass(), $store, (object) ['$TOP' => $n]], 1)); + } else { + $kn = self::getpath($n, $keypath, null, $state); } + self::setprop($tsrc, $kn, $n); } - $tpath = array_slice($path, 0, -1); + $tpath = self::slice($state->path, -1); - $ckey = self::getelem($path, -2); - $dpath = [self::S_DTOP]; - if (!empty($srcpath)) { - $dpath = array_merge($dpath, explode('.', $srcpath)); - } - $dpath[] = '$:' . $ckey; + $ckey = self::getelem($state->path, -2); + $dpath = self::flatten([self::S_DTOP, explode(self::S_DT, $srcpath), '$:' . $ckey]); - // Build nested structure like TypeScript using objects, not arrays - $tcur = (object) [$ckey => $tcur]; + $tcur = (object) [$ckey => $tsrc]; - if (count($tpath) > 1) { - $pkey = self::getelem($path, -3) ?? self::S_DTOP; + if (1 < self::size($tpath)) { + $pkey = self::getelem($state->path, -3, self::S_DTOP); $tcur = (object) [$pkey => $tcur]; $dpath[] = '$:' . $pkey; } - // Create child injection state matching TypeScript - $slicedNodes = array_slice($nodes, 0, -1); - $childState = (object) [ - 'mode' => self::S_MVAL, - 'full' => false, - 'keyI' => 0, - 'keys' => [$ckey], - 'key' => $ckey, - 'val' => $tval, - 'parent' => self::getelem($slicedNodes, -1), - 'path' => $tpath, - 'nodes' => $slicedNodes, - 'handler' => [self::class, '_injecthandler'], - 'base' => $state->base, - 'modify' => $state->modify, - 'errs' => $state->errs ?? [], - 'meta' => $state->meta ?? 
(object) [], - 'dparent' => $tcur, - 'dpath' => $dpath, - ]; - - // Set the value in parent like TypeScript version does - self::setprop($childState->parent, $ckey, $tval); - - // Instead of injecting the entire template at once, - // inject each individual template with its own data context - foreach ((array) $tval as $templateKey => $template) { - // Get the corresponding source node for this template - // $tcur structure may be nested like: {$TOP: {ckey: {K0: sourceNode0, K1: sourceNode1, ...}}} - // Navigate through the structure to find the actual source data - $sourceData = $tcur; - - // If tcur has $TOP level, navigate through it - if (self::getprop($sourceData, self::S_DTOP) !== self::UNDEF) { - $sourceData = self::getprop($sourceData, self::S_DTOP); - } - - // Then navigate to the ckey level - $sourceData = self::getprop($sourceData, $ckey); - - // Finally get the specific source node - $sourceNode = self::getprop($sourceData, $templateKey); - - if ($sourceNode !== self::UNDEF) { - // Create individual injection state for this template - $individualState = clone $childState; - $individualState->dparent = $sourceNode; // Set to individual source node - $individualState->key = $templateKey; - - // Inject this individual template - $injectedTemplate = self::inject($template, $store, $state->modify, $sourceNode, $individualState); - self::setprop($tval, $templateKey, $injectedTemplate); - } - } - - $rval = $tval; + $tinj = $state->child(0, [$ckey]); + $tinj->path = $tpath; + $tinj->nodes = self::slice($state->nodes, -1); + + $tinj->parent = self::getelem($tinj->nodes, -1); + $tinj->val = $tval; + + $tinj->dpath = $dpath; + $tinj->dparent = $tcur; + + self::inject($tval, $store, $tinj); + $rval = $tinj->val; } - // Use _setparentprop to properly set the parent value to the packed data - self::_setparentprop($state, $rval); - // Return UNDEF to signal that this key should be deleted + // Update ancestors. 
+ self::setprop($target, $tkey, $rval); + + // Drop transform key. return self::UNDEF; } + /** @internal */ public static function transform_REF(object $state, mixed $_val, string $_ref, mixed $store): mixed { - if (self::S_MVAL !== $state->mode) { + $nodes = $state->nodes; + + if (self::M_VAL !== $state->mode) { return self::UNDEF; } - $parentVal = self::getprop($state->parent, $state->key); - // Ref path is the second element of the list (parent), not of the current value + + // Get arguments: ['`$REF`', 'ref-path']. $refpath = self::getprop($state->parent, 1); - $state->keyI = self::size($state->keys ?? []); + $state->keyI = self::size($state->keys); + + // Spec reference. $specFn = self::getprop($store, '$SPEC'); $spec = is_callable($specFn) ? $specFn() : self::UNDEF; + $dpath = self::slice($state->path, 1); - $pathState = (object) ['dpath' => $dpath, 'dparent' => self::getpath($dpath, $spec)]; - $ref = self::getpath($refpath, $spec, null, null); + $ref = self::getpath($spec, $refpath, (object) [ + 'dpath' => $dpath, + 'dparent' => self::getpath($spec, $dpath), + ]); + $hasSubRef = false; if (self::isnode($ref)) { self::walk($ref, function ($_k, $v) use (&$hasSubRef) { @@ -1822,55 +2038,59 @@ public static function transform_REF(object $state, mixed $_val, string $_ref, m return $v; }); } - $tref = self::clone($ref); - $pathLen = count($state->path); - $cpath = $pathLen >= 3 ? self::slice($state->path, 0, -2) : []; - $tpath = self::slice($state->path, 0, -1); - $tcur = self::getpath($cpath, $store); - // Resolve current value at path from spec; strip $TOP if present so we resolve relative to spec root - $tpathInSpec = (isset($state->path[0]) && $state->path[0] === self::S_DTOP) - ? 
self::slice($state->path, 1, -1) : $tpath; - $tval = self::getpath($tpathInSpec, $spec); + + $tref = self::cloneWrap($ref); + + $cpath = self::slice($state->path, -3); + $tpath = self::slice($state->path, -1); + $tcur = self::getpath($store, $cpath); + $tval = self::getpath($store, $tpath); $rval = self::UNDEF; - // Resolve when: no nested $REF, or current path exists in spec, or inside list with scalar ref - $insideListWithScalarRef = isset($state->prior) && !self::isnode($ref); - $shouldResolve = !$hasSubRef || $tval !== self::UNDEF || $insideListWithScalarRef; - if ($shouldResolve) { - $lastKey = self::getelem($tpath, -1); - $tinj = (object) [ - 'mode' => self::S_MVAL, 'key' => $lastKey, - 'parent' => self::getelem($state->nodes, -2), - 'path' => $tpath, 'nodes' => array_slice($state->nodes, 0, -1), - 'val' => $tref, 'dpath' => self::flatten([$cpath]), 'dparent' => $tcur, - 'handler' => $state->handler, 'base' => $state->base, 'modify' => $state->modify, - 'errs' => $state->errs ?? [], 'meta' => $state->meta ?? (object) [], - ]; - $rval = self::inject($tref, $store, $state->modify, $tcur, $tinj); - } - // When ref is scalar and we didn't resolve (e.g. path/tval issue), use ref as value - if ($rval === self::UNDEF && !self::isnode($ref)) { - $rval = $ref; - } - // Set on grandparent (spec) when inside a list so we replace the list key, not the list element. - // When we have prior (list state), the list's container is prior->nodes[1] at prior->path[1] (spec at 'r0'). - if (count($state->path) >= 2) { - $specFn = self::getprop($store, '$SPEC'); - $specToSet = is_callable($specFn) ? 
$specFn() : self::UNDEF; - $specKey = $state->path[1]; - if ($specToSet !== self::UNDEF && $specKey !== self::UNDEF) { - self::setprop($specToSet, $specKey, $rval); + + if (!$hasSubRef || self::UNDEF !== $tval) { + $tinj = $state->child(0, [self::getelem($tpath, -1)]); + + $tinj->path = $tpath; + $tinj->nodes = self::slice($state->nodes, -1); + $tinj->parent = self::getelem($nodes, -2); + $tinj->val = $tref; + + $tinj->dpath = self::flatten([$cpath]); + $tinj->dparent = $tcur; + + $injResult = self::inject($tref, $store, $tinj); + + // If inject returned SKIP, use tref (mutated in place) not tinj->val (which may be SKIP) + if ($injResult === self::$SKIP || $tinj->val === self::$SKIP) { + $rval = is_object($tref) ? $tref : self::UNDEF; } else { - self::_setval($state, $rval, 0); + $rval = $tinj->val; } } else { - self::_setval($state, $rval, 0); + $rval = self::UNDEF; } - if (isset($state->prior)) { + + $grandparent = $state->setval($rval, 2); + + // PHP: arrays in nodes are copies, so ancestor setval on arrays doesn't propagate. + // Sync the prior injection's parent if it's an array. + if ($state->prior && is_array($state->prior->parent)) { + $akey = self::getelem($state->path, -2); + if (self::UNDEF === $rval) { + $state->prior->parent = self::delprop($state->prior->parent, $akey); + } else { + self::setprop($state->prior->parent, $akey, $rval); + } + } + + if (self::islist($grandparent) && $state->prior) { $state->prior->keyI--; } - return self::$SKIP; + + return $_val; } + /** * Transform data using a spec. 
* @@ -1882,17 +2102,36 @@ public static function transform_REF(object $state, mixed $_val, string $_ref, m public static function transform( mixed $data, mixed $spec, - mixed $extra = null, - ?callable $modify = null + mixed $injdef = null ): mixed { - // 1) clone spec so we can mutate it - $specClone = self::clone($spec); + // Support injdef object pattern or backward compat (extra data passed directly) + $extra = null; + $modify = null; + $errs = null; + if (is_object($injdef) && ( + property_exists($injdef, 'extra') || + property_exists($injdef, 'modify') || + property_exists($injdef, 'errs') || + property_exists($injdef, 'meta') || + property_exists($injdef, 'handler') + )) { + // New injdef pattern: { extra, modify, errs, meta, handler } + $extra = property_exists($injdef, 'extra') ? $injdef->extra : null; + $modify = property_exists($injdef, 'modify') ? $injdef->modify : null; + $errs = property_exists($injdef, 'errs') ? $injdef->errs : null; + } else { + // Backward compat: treat 3rd arg as extra data/store directly + $extra = $injdef; + } + + // 1) clone spec, wrapping arrays in ListRef for reference stability (Go pattern) + $specClone = self::cloneWrap($spec); // 2) split extra into data vs transforms $extraTransforms = []; $extraData = []; - foreach ((array) $extra as $k => $v) { + foreach ((array) ($extra ?? 
[]) as $k => $v) { if (str_starts_with((string) $k, self::S_DS)) { $extraTransforms[$k] = $v; } else { @@ -1902,8 +2141,8 @@ public static function transform( // 3) build the combined store $dataClone = self::merge([ - self::clone($extraData), - self::clone($data), + self::cloneWrap($extraData), + self::cloneWrap($data), ]); $store = (object) array_merge( @@ -1920,36 +2159,53 @@ public static function transform( '$MERGE' => [self::class, 'transform_MERGE'], '$EACH' => [self::class, 'transform_EACH'], '$PACK' => [self::class, 'transform_PACK'], - '$SPEC' => fn() => $specClone, + '$SPEC' => fn() => $spec, '$REF' => [self::class, 'transform_REF'], ], $extraTransforms ); // 4) run inject to do the transform - $result = self::inject($specClone, $store, $modify, $dataClone); + $injectOpts = new \stdClass(); + if ($modify !== null) { + $injectOpts->modify = $modify; + } + if (is_object($injdef) && property_exists($injdef, 'handler') && $injdef->handler !== null) { + $injectOpts->handler = $injdef->handler; + } + if (is_object($injdef) && property_exists($injdef, 'meta') && $injdef->meta !== null) { + $injectOpts->meta = $injdef->meta; + } + if (is_object($injdef) && property_exists($injdef, 'errs') && $injdef->errs !== null) { + $injectOpts->errs = $injdef->errs; + } + $result = self::inject($specClone, $store, $injectOpts); // When a child transform (e.g. 
$REF) deletes the key, inject returns SKIP; return mutated spec if ($result === self::$SKIP) { - return $specClone; + return self::cloneUnwrap($specClone); } - return $result; - } - /** @internal */ - private static function _setparentprop(object $state, mixed $val): void { - if ($val === self::UNDEF) { - self::delprop($state->parent, $state->key); - } else { - self::setprop($state->parent, $state->key, $val); - } + return self::cloneUnwrap($result); } - /** @internal */ - private static function _updateAncestors(object $_state, mixed &$target, mixed $tkey, mixed $tval): void - { - // In TS this simply re-writes the transformed value into its ancestor - self::setprop($target, $tkey, $tval); + /** + * Remove unresolved $REF list entries from a list spec. + * This handles PHP's value-type arrays where in-place mutation via references doesn't propagate. + */ + private static function _cleanRefEntries(array $list): array { + $cleaned = []; + foreach ($list as $item) { + if (self::islist($item) && count($item) >= 1 && self::getprop($item, 0) === '`$REF`') { + // This is an unresolved $REF entry - remove it + continue; + } + if (self::islist($item)) { + $item = self::_cleanRefEntries($item); + } + $cleaned[] = $item; + } + return $cleaned; } /** @internal */ @@ -1965,28 +2221,6 @@ private static function _invalidTypeMsg(array $path, string $needtype, int $vt, /* ======================= * Validation Functions * ======================= - */ - - /** - * Helper function to set a value in injection state, equivalent to TypeScript's setval method - */ - private static function _setval(object $inj, mixed $val, int $ancestor = 0): void - { - if ($ancestor === 0) { - self::setprop($inj->parent, $inj->key, $val); - } else { - // Navigate up the ancestor chain - $targetIndex = count($inj->nodes) + $ancestor; - if ($targetIndex >= 0 && $targetIndex < count($inj->nodes)) { - $targetNode = $inj->nodes[$targetIndex]; - $pathIndex = count($inj->path) + $ancestor; - if ($pathIndex >= 0 
&& $pathIndex < count($inj->path)) { - $targetKey = $inj->path[$pathIndex]; - self::setprop($targetNode, $targetKey, $val); - } - } - } - } /** * A required string value. @@ -2091,6 +2325,25 @@ public static function validate_FUNCTION(object $inj): mixed return $out; } + /** + * Generic type validator. Validates against any type name via TYPENAME lookup. + */ + public static function validate_TYPE(object $inj, mixed $_val = null, ?string $ref = null): mixed + { + $tname = strtolower(substr($ref ?? '', 1)); + $idx = array_search($tname, self::TYPENAME); + $typev = ($idx !== false) ? (1 << (31 - $idx)) : 0; + $out = self::getprop($inj->dparent, $inj->key); + + $t = self::typify($out); + if (0 === ($t & $typev)) { + $inj->errs[] = self::_invalidTypeMsg($inj->path, $tname, $t, $out); + return self::UNDEF; + } + + return $out; + } + /** * Allow any value. */ @@ -2114,7 +2367,7 @@ public static function validate_CHILD(object $inj): mixed $path = $inj->path; // Map syntax. - if (self::S_MKEYPRE === $mode) { + if (self::M_KEYPRE === $mode) { $childtm = self::getprop($parent, $key); // Get corresponding current object. @@ -2139,12 +2392,12 @@ public static function validate_CHILD(object $inj): mixed $inj->keys = $keys; // Remove $CHILD to cleanup output. - self::_setval($inj, self::UNDEF); + $inj->setval(self::UNDEF); return self::UNDEF; } // List syntax. - if (self::S_MVAL === $mode) { + if (self::M_VAL === $mode) { if (!self::islist($parent)) { // $CHILD was not inside a list. $inj->errs[] = 'Invalid $CHILD as value'; @@ -2200,7 +2453,7 @@ public static function validate_ONE( $keyI = $inj->keyI; // Only operate in val mode, since parent is a list. - if (self::S_MVAL === $mode) { + if (self::M_VAL === $mode) { if (!self::islist($parent) || 0 !== $keyI) { $inj->errs[] = 'The $ONE validator at field ' . self::pathify($inj->path, 1, 1) . @@ -2211,7 +2464,7 @@ public static function validate_ONE( $inj->keyI = count($inj->keys ?? 
[]); // Clean up structure, replacing [$ONE, ...] with current - self::_setval($inj, $inj->dparent, -2); + $inj->setval($inj->dparent, 2); $inj->path = self::slice($inj->path, 0, -1); $inj->key = self::getelem($inj->path, -1); @@ -2237,7 +2490,7 @@ public static function validate_ONE( 'meta' => $inj->meta, ]); - self::_setval($inj, $vcurrent, -2); + $inj->setval($vcurrent, 2); // Accept current value if there was a match if (0 === count($terrs)) { @@ -2246,9 +2499,8 @@ public static function validate_ONE( } // There was no match. - $valdesc = implode(', ', array_map(function($v) { - return self::stringify($v); - }, $tvals)); + $tvArr = ($tvals instanceof ListRef) ? $tvals->list : (is_array($tvals) ? $tvals : []); + $valdesc = implode(', ', array_map(fn($v) => self::stringify($v), $tvArr)); $valdesc = preg_replace(self::R_TRANSFORM_NAME, '$1', strtolower($valdesc)); $inj->errs[] = self::_invalidTypeMsg( @@ -2271,7 +2523,7 @@ public static function validate_EXACT(object $inj): mixed $keyI = $inj->keyI; // Only operate in val mode, since parent is a list. - if (self::S_MVAL === $mode) { + if (self::M_VAL === $mode) { if (!self::islist($parent) || 0 !== $keyI) { $inj->errs[] = 'The $EXACT validator at field ' . self::pathify($inj->path, 1, 1) . @@ -2282,7 +2534,7 @@ public static function validate_EXACT(object $inj): mixed $inj->keyI = count($inj->keys ?? []); // Clean up structure, replacing [$EXACT, ...] with current data parent - self::_setval($inj, $inj->dparent, -2); + $inj->setval($inj->dparent, 2); $inj->path = self::slice($inj->path, 0, count($inj->path) - 1); $inj->key = self::getelem($inj->path, -1); @@ -2311,9 +2563,8 @@ public static function validate_EXACT(object $inj): mixed } } - $valdesc = implode(', ', array_map(function($v) { - return self::stringify($v); - }, $tvals)); + $tvArr = ($tvals instanceof ListRef) ? $tvals->list : (is_array($tvals) ? 
$tvals : []); + $valdesc = implode(', ', array_map(fn($v) => self::stringify($v), $tvArr)); $valdesc = preg_replace(self::R_TRANSFORM_NAME, '$1', strtolower($valdesc)); $inj->errs[] = self::_invalidTypeMsg( @@ -2348,7 +2599,7 @@ private static function _validation( } // select needs exact matches - $exact = self::getprop($inj->meta ?? (object) [], '`$EXACT`'); + $exact = self::getprop($inj->meta, '`$EXACT`'); // Current val to verify. $cval = self::getprop($inj->dparent, $key); @@ -2433,9 +2684,9 @@ private static function _validatehandler( if ($ismetapath) { if ('=' === $matches[2]) { - self::_setval($inj, ['`$EXACT`', $val]); + $inj->setval(['`$EXACT`', $val]); } else { - self::_setval($inj, $val); + $inj->setval($val); } $inj->keyI = -1; @@ -2474,11 +2725,16 @@ public static function validate(mixed $data, mixed $spec, mixed $injdef = null): '$PACK' => null, '$STRING' => [self::class, 'validate_STRING'], - '$NUMBER' => [self::class, 'validate_NUMBER'], - '$BOOLEAN' => [self::class, 'validate_BOOLEAN'], - '$OBJECT' => [self::class, 'validate_OBJECT'], - '$ARRAY' => [self::class, 'validate_ARRAY'], - '$FUNCTION' => [self::class, 'validate_FUNCTION'], + '$NUMBER' => [self::class, 'validate_TYPE'], + '$INTEGER' => [self::class, 'validate_TYPE'], + '$DECIMAL' => [self::class, 'validate_TYPE'], + '$BOOLEAN' => [self::class, 'validate_TYPE'], + '$NULL' => [self::class, 'validate_TYPE'], + '$NIL' => [self::class, 'validate_TYPE'], + '$MAP' => [self::class, 'validate_TYPE'], + '$LIST' => [self::class, 'validate_TYPE'], + '$FUNCTION' => [self::class, 'validate_TYPE'], + '$INSTANCE' => [self::class, 'validate_TYPE'], '$ANY' => [self::class, 'validate_ANY'], '$CHILD' => [self::class, 'validate_CHILD'], '$ONE' => [self::class, 'validate_ONE'], @@ -2490,7 +2746,15 @@ public static function validate(mixed $data, mixed $spec, mixed $injdef = null): $meta = is_object($injdef) && property_exists($injdef, 'meta') ? 
$injdef->meta : null; - $out = self::transform($data, $spec, $store, [self::class, '_validation']); + $transformOpts = new \stdClass(); + $transformOpts->extra = $store; + $transformOpts->modify = [self::class, '_validation']; + $transformOpts->handler = [self::class, '_validatehandler']; + if ($meta !== null) { + $transformOpts->meta = $meta; + } + $transformOpts->errs = $errs; + $out = self::transform($data, $spec, $transformOpts); $generr = (0 < count($errs) && !$collect); if ($generr) { @@ -2509,7 +2773,7 @@ public static function validate(mixed $data, mixed $spec, mixed $injdef = null): * @param mixed $children The object or array to search in * @return array Array of matching children */ - public static function select(mixed $query, mixed $children): array + public static function select(mixed $children, mixed $query): array { if (!self::isnode($children)) { return []; @@ -2536,10 +2800,12 @@ public static function select(mixed $query, mixed $children): array 'extra' => [ '$AND' => [self::class, 'select_AND'], '$OR' => [self::class, 'select_OR'], + '$NOT' => [self::class, 'select_NOT'], '$GT' => [self::class, 'select_CMP'], '$LT' => [self::class, 'select_CMP'], '$GTE' => [self::class, 'select_CMP'], '$LTE' => [self::class, 'select_CMP'], + '$LIKE' => [self::class, 'select_CMP'], ] ]; @@ -2569,7 +2835,7 @@ public static function select(mixed $query, mixed $children): array */ private static function select_AND(object $state, mixed $val, mixed $current, string $ref, mixed $store): mixed { - if (self::S_MKEYPRE === $state->mode) { + if (self::M_KEYPRE === $state->mode) { $terms = self::getprop($state->parent, $state->key); $src = self::getprop($store, $state->base, $store); @@ -2594,7 +2860,7 @@ private static function select_AND(object $state, mixed $val, mixed $current, st */ private static function select_OR(object $state, mixed $val, mixed $current, string $ref, mixed $store): mixed { - if (self::S_MKEYPRE === $state->mode) { + if (self::M_KEYPRE === 
$state->mode) { $terms = self::getprop($state->parent, $state->key); $src = self::getprop($store, $state->base, $store); @@ -2616,39 +2882,76 @@ private static function select_OR(object $state, mixed $val, mixed $current, str return null; } + /** + * Helper method for $NOT operator in select queries + */ + private static function select_NOT(object $state, mixed $_val, mixed $_ref, mixed $store): mixed + { + if (self::M_KEYPRE === $state->mode) { + $term = self::getprop($state->parent, $state->key); + + $ppath = self::slice($state->path, -1); + $point = self::getpath($store, $ppath); + + $vstore = self::merge([(object) [], $store], 1); + $vstore->{'$TOP'} = $point; + + $terrs = []; + self::validate($point, $term, (object) [ + 'extra' => $vstore, + 'errs' => $terrs, + 'meta' => $state->meta, + ]); + + if (count($terrs) === 0) { + $state->errs[] = 'NOT:' . self::pathify($ppath) . ': ' . self::stringify($point) . ' fail:' . self::stringify($term); + } + + $gkey = self::getelem($state->path, -2); + $gp = self::getelem($state->nodes, -2); + self::setprop($gp, $gkey, $point); + } + return null; + } + /** * Helper method for comparison operators in select queries */ private static function select_CMP(object $state, mixed $_val, string $ref, mixed $store): mixed { - if (self::S_MKEYPRE === $state->mode) { + if (self::M_KEYPRE === $state->mode) { $term = self::getprop($state->parent, $state->key); - $src = self::getprop($store, $state->base, $store); $gkey = self::getelem($state->path, -2); - $tval = self::getprop($src, $gkey); + $ppath = self::slice($state->path, -1); + $point = self::getpath($store, $ppath); + $pass = false; - if ('$GT' === $ref && $tval > $term) { + if ('$GT' === $ref && $point > $term) { + $pass = true; + } + elseif ('$LT' === $ref && $point < $term) { $pass = true; } - else if ('$LT' === $ref && $tval < $term) { + elseif ('$GTE' === $ref && $point >= $term) { $pass = true; } - else if ('$GTE' === $ref && $tval >= $term) { + elseif ('$LTE' === $ref && 
$point <= $term) { $pass = true; } - else if ('$LTE' === $ref && $tval <= $term) { + elseif ('$LIKE' === $ref && preg_match('/' . $term . '/', self::stringify($point))) { $pass = true; } if ($pass) { // Update spec to match found value so that _validate does not complain $gp = self::getelem($state->nodes, -2); - self::setprop($gp, $gkey, $tval); + self::setprop($gp, $gkey, $point); } else { - $state->errs[] = 'CMP: fail:' . $ref . ' ' . self::stringify($term); + $state->errs[] = 'CMP: ' . self::pathify($ppath) . ': ' . self::stringify($point) . + ' fail:' . $ref . ' ' . self::stringify($term); } } return null; @@ -2668,22 +2971,24 @@ public static function getelem(mixed $val, mixed $key, mixed $alt = self::UNDEF) } if (self::islist($val)) { + $listArr = ($val instanceof ListRef) ? $val->list : $val; + $listLen = count($listArr); if (is_string($key)) { if (!preg_match('/^[-0-9]+$/', $key)) { $out = self::UNDEF; } else { $nkey = (int) $key; if ($nkey < 0) { - $nkey = count($val) + $nkey; + $nkey = $listLen + $nkey; } - $out = array_key_exists($nkey, $val) ? $val[$nkey] : self::UNDEF; + $out = ($nkey >= 0 && $nkey < $listLen) ? $listArr[$nkey] : self::UNDEF; } } elseif (is_int($key)) { $nkey = $key; if ($nkey < 0) { - $nkey = count($val) + $nkey; + $nkey = $listLen + $nkey; } - $out = array_key_exists($nkey, $val) ? $val[$nkey] : self::UNDEF; + $out = ($nkey >= 0 && $nkey < $listLen) ? 
$listArr[$nkey] : self::UNDEF; } } @@ -2710,11 +3015,22 @@ public static function delprop(mixed $parent, mixed $key): mixed return $parent; } + if ($parent instanceof ListRef) { + $keyI = (int)$key; + if (!is_numeric($key)) { + return $parent; + } + if ($keyI >= 0 && $keyI < count($parent->list)) { + array_splice($parent->list, $keyI, 1); + } + return $parent; + } + if (self::ismap($parent)) { $key = self::strkey($key); unset($parent->$key); } - else if (self::islist($parent)) { + elseif (self::islist($parent)) { // Ensure key is an integer $keyI = (int)$key; if (!is_numeric($key) || (string)$keyI !== (string)$key) { @@ -2733,159 +3049,6 @@ public static function delprop(mixed $parent, mixed $key): mixed return $parent; } - private static function _injectval( - object $state, - mixed $val, - mixed $current, - mixed $store - ): mixed { - $valtype = gettype($val); - - // Descend into node (arrays and objects) - if (self::isnode($val)) { - // Check if this object has been replaced by a PACK transform - if (self::ismap($val) && self::getprop($val, '__PACK_REPLACED__') === true) { - // The parent structure has been replaced, skip processing this object - // But first, clean up the marker so it doesn't appear in the final output - self::delprop($val, '__PACK_REPLACED__'); - return $val; - } - - // Keys are sorted alphanumerically to ensure determinism. - // Injection transforms ($FOO) are processed *after* other keys. 
- if (self::ismap($val)) { - $allKeys = array_keys((array) $val); - $normalKeys = []; - $transformKeys = []; - - foreach ($allKeys as $k) { - if (str_contains((string) $k, self::S_DS)) { - $transformKeys[] = $k; - } else { - $normalKeys[] = $k; - } - } - - sort($normalKeys); - sort($transformKeys); - $nodekeys = array_merge($normalKeys, $transformKeys); - } else { - // For lists, keys are just the indices - important: use indices as integers like TypeScript - $nodekeys = array_keys($val); - } - - // Each child key-value pair is processed in three injection phases: - // 1. mode='key:pre' - Key string is injected, returning a possibly altered key. - // 2. mode='val' - The child value is injected. - // 3. mode='key:post' - Key string is injected again, allowing child mutation. - $childReturnedSkip = false; - for ($nkI = 0; $nkI < count($nodekeys); $nkI++) { - $nodekey = $nodekeys[$nkI]; - - // Create child injection state - $childpath = array_merge($state->path, [self::strkey($nodekey)]); - $childnodes = array_merge($state->nodes, [$val]); - $childval = self::getprop($val, $nodekey); - - // Calculate the child data context (dparent) - // Only descend into data properties when the spec value is a nested object - // This allows relative paths to work while keeping simple injections at the right level - $child_dparent = $state->dparent; - if ($child_dparent !== self::UNDEF && $child_dparent !== null && self::isnode($childval)) { - $child_dparent = self::getprop($child_dparent, self::strkey($nodekey)); - } - - $childinj = (object) [ - 'mode' => self::S_MKEYPRE, - 'full' => false, - 'keyI' => $nkI, - 'keys' => $nodekeys, - 'key' => self::strkey($nodekey), - 'val' => $childval, - 'parent' => $val, - 'path' => $childpath, - 'nodes' => $childnodes, - 'handler' => $state->handler, - 'base' => $state->base, - 'modify' => $state->modify, - 'errs' => $state->errs ?? [], - 'meta' => $state->meta ?? (object) [], - 'dparent' => $child_dparent, - 'dpath' => isset($state->dpath) ? 
array_merge($state->dpath, [self::strkey($nodekey)]) : [self::strkey($nodekey)], - 'prior' => $state, - ]; - - // Perform the key:pre mode injection on the child key. - $prekey = self::_injectstr(self::strkey($nodekey), $store, $childinj); - - // The injection may modify child processing. - $nkI = max(0, $childinj->keyI); - $nodekeys = $childinj->keys; - - // If prekey is UNDEF, delete the key and skip further processing - if ($prekey === self::UNDEF) { - // Delete the key from the parent - self::delprop($val, $nodekey); - - // Remove this key from the nodekeys array to prevent issues with iteration - array_splice($nodekeys, $nkI, 1); - $nkI--; // Adjust index since we removed an element - continue; - } - - // Continue with normal processing - $childinj->val = self::getprop($val, $prekey); - $childinj->mode = self::S_MVAL; - - // Perform the val mode injection on the child value. - // Pass the child injection state to maintain context - $injected_result = self::inject($childinj->val, $store, $state->modify, $childinj->dparent, $childinj); - if ($injected_result === self::$SKIP) { - $childReturnedSkip = true; - } else { - self::setprop($val, $nodekey, $injected_result); - } - - // The injection may modify child processing. - $nkI = max(0, $childinj->keyI); - $nodekeys = $childinj->keys; - - // Perform the key:post mode injection on the child key. - $childinj->mode = self::S_MKEYPOST; - self::_injectstr(self::strkey($nodekey), $store, $childinj); - - // The injection may modify child processing. - $nkI = max(0, $childinj->keyI); - $nodekeys = $childinj->keys; - } - - if ($childReturnedSkip) { - return self::$SKIP; - } - } - // Inject paths into string scalars. 
- else if ($valtype === 'string') { - $state->mode = self::S_MVAL; - $val = self::_injectstr($val, $store, $state); - if ($val !== self::$SKIP) { // PHP equivalent of SKIP check - self::setprop($state->parent, $state->key, $val); - } - } - - // Custom modification - if ($state->modify) { - $mkey = $state->key; - $mparent = $state->parent; - $mval = self::getprop($mparent, $mkey); - call_user_func($state->modify, $mval, $mkey, $mparent, $state, $current, $store); - // Return the value after modify (callback may have updated parent) - $val = self::getprop($mparent, $mkey); - } - - $state->val = $val; - - return $val; - } public static function setpath( mixed $store, @@ -2928,13 +3091,166 @@ public static function setpath( } } -?>op($parent, self::getelem($parts, -1)); - } else { - self::setprop($parent, self::getelem($parts, -1), $val); + + +class Injection +{ + private const MODENAME = [ + Struct::M_VAL => 'val', + Struct::M_KEYPRE => 'key:pre', + Struct::M_KEYPOST => 'key:post', + ]; + + public int $mode; + public bool $full; + public int $keyI; + public array $keys; + public string $key; + public mixed $val; + public mixed $parent; + public array $path; + public array $nodes; + /** @var callable */ + public mixed $handler; + public array $errs; + public object $meta; + public mixed $dparent; + public array $dpath; + public string $base; + /** @var callable|null */ + public mixed $modify; + public ?Injection $prior; + public mixed $extra; + + public function __construct(mixed $val, mixed $parent) + { + $this->val = $val; + $this->parent = $parent; + $this->errs = []; + + $this->dparent = Struct::UNDEF; + $this->dpath = ['$TOP']; + + $this->mode = Struct::M_VAL; + $this->full = false; + $this->keyI = 0; + $this->keys = ['$TOP']; + $this->key = '$TOP'; + $this->path = ['$TOP']; + $this->nodes = [$parent]; + $this->handler = [Struct::class, '_injecthandler']; + $this->base = '$TOP'; + $this->meta = (object) []; + $this->modify = null; + $this->prior = null; + 
$this->extra = null; + } + + + public function __toString(): string + { + return $this->toString(); + } + + public function toString(?string $prefix = null): string + { + return 'INJ' . (null === $prefix ? '' : '/' . $prefix) . ':' . + Struct::pad(Struct::pathify($this->path, 1)) . + (self::MODENAME[$this->mode] ?? '') . ($this->full ? '/full' : '') . ':' . + 'key=' . $this->keyI . '/' . $this->key . '/' . '[' . implode(',', $this->keys) . ']' . + ' p=' . Struct::stringify($this->parent, -1, 1) . + ' m=' . Struct::stringify($this->meta, -1, 1) . + ' d/' . Struct::pathify($this->dpath, 1) . '=' . Struct::stringify($this->dparent, -1, 1) . + ' r=' . Struct::stringify(Struct::getprop($this->nodes[0] ?? null, '$TOP'), -1, 1); + } + + + public function descend(): mixed + { + if (!isset($this->meta->__d)) { + $this->meta->__d = 0; } + $this->meta->__d++; + $parentkey = Struct::getelem($this->path, -2); - return $parent; + // Resolve current node in store for local paths. + if (Struct::UNDEF === $this->dparent) { + + // Even if there's no data, dpath should continue to match path, so that + // relative paths work properly. + if (1 < Struct::size($this->dpath)) { + $this->dpath = Struct::flatten([$this->dpath, $parentkey]); + } + } + else { + // this->dparent is the containing node of the current store value. + if (null !== $parentkey && Struct::UNDEF !== $parentkey) { + $this->dparent = Struct::getprop($this->dparent, $parentkey); + + $lastpart = Struct::getelem($this->dpath, -1); + if ($lastpart === '$:' . $parentkey) { + $this->dpath = Struct::slice($this->dpath, -1); + } + else { + $this->dpath = Struct::flatten([$this->dpath, $parentkey]); + } + } + } + + return $this->dparent; } + + public function child(int $keyI, array $keys): Injection + { + $key = Struct::strkey($keys[$keyI] ?? 
null); + $val = $this->val; + + $cinj = new Injection(Struct::getprop($val, $key), $val); + $cinj->keyI = $keyI; + $cinj->keys = $keys; + $cinj->key = $key; + + $cinj->path = Struct::flatten([Struct::getdef($this->path, []), $key]); + $cinj->nodes = Struct::flatten([Struct::getdef($this->nodes, []), [$val]]); + + $cinj->mode = $this->mode; + $cinj->handler = $this->handler; + $cinj->modify = $this->modify; + $cinj->base = $this->base; + $cinj->meta = $this->meta; + $cinj->errs = &$this->errs; + $cinj->prior = $this; + + $cinj->dpath = Struct::flatten([$this->dpath]); + $cinj->dparent = $this->dparent; + + return $cinj; + } + + + public function setval(mixed $val, ?int $ancestor = null): mixed + { + $parent = Struct::UNDEF; + if (null === $ancestor || $ancestor < 2) { + if (Struct::UNDEF === $val) { + $this->parent = Struct::delprop($this->parent, $this->key); + $parent = $this->parent; + } else { + $parent = Struct::setprop($this->parent, $this->key, $val); + } + } + else { + $aval = Struct::getelem($this->nodes, 0 - $ancestor); + $akey = Struct::getelem($this->path, 0 - $ancestor); + if (Struct::UNDEF === $val) { + $parent = Struct::delprop($aval, $akey); + } else { + $parent = Struct::setprop($aval, $akey, $val); + } + } + + return $parent; + } } ?> \ No newline at end of file diff --git a/php/tests/Runner.php b/php/tests/Runner.php index 121a54d4..eb05687e 100644 --- a/php/tests/Runner.php +++ b/php/tests/Runner.php @@ -202,7 +202,7 @@ private static function resolveTestPack(string $name, $entry, $subject, $client, private static function match($check, $base, $structUtils): void { $structUtils->walk($check, function ($key, $val, $parent, $path) use ($base, $structUtils) { if (!is_array($val) && !is_object($val)) { - $baseval = $structUtils->getpath($path, $base); + $baseval = $structUtils->getpath($base, $path); if ($baseval === $val) { return; } diff --git a/php/tests/StructTest.php b/php/tests/StructTest.php index 4ecac515..4550afc0 100644 --- 
a/php/tests/StructTest.php +++ b/php/tests/StructTest.php @@ -388,8 +388,8 @@ function ($input) { return $val(); }; return Struct::getpath( - $input->path, $store, + $input->path, null, $state ); @@ -577,7 +577,7 @@ public function testGetpathBasic(): void function ($input) { $path = property_exists($input, 'path') ? $input->path : Struct::UNDEF; $store = property_exists($input, 'store') ? $input->store : Struct::UNDEF; - $result = Struct::getpath($path, $store); + $result = Struct::getpath($store, $path); return $result; }, true @@ -598,7 +598,7 @@ function ($input) { if (property_exists($input, 'dpath')) { $state->dpath = explode('.', $input->dpath); } - $result = Struct::getpath($path, $store, null, $state); + $result = Struct::getpath($store, $path, null, $state); return $result; }, true @@ -613,7 +613,7 @@ function ($input) { $path = property_exists($input, 'path') ? $input->path : Struct::UNDEF; $store = property_exists($input, 'store') ? $input->store : Struct::UNDEF; $state = property_exists($input, 'inj') ? $input->inj : null; - $result = Struct::getpath($path, $store, null, $state); + $result = Struct::getpath($store, $path, null, $state); return $result; }, true @@ -640,7 +640,7 @@ public function testInjectBasic(): void public function testInjectString(): void { // a no-op modifier for string‐only tests - $nullModifier = function ($v, $k, $p, $state, $current, $store) { + $nullModifier = function ($v, $k = null, $p = null, $state = null, $store = null) { // do nothing return $v; }; @@ -648,9 +648,9 @@ public function testInjectString(): void $this->testSet( $this->testSpec->inject->string, function (stdClass $in) use ($nullModifier) { - // some specs may include a 'current' key - $current = property_exists($in, 'current') ? 
$in->current : null; - return Struct::inject($in->val, $in->store, $nullModifier, $current); + $opts = new \stdClass(); + $opts->modify = $nullModifier; + return Struct::inject($in->val, $in->store, $opts); }, /* force deep‐equal */ true ); @@ -732,15 +732,17 @@ public function testTransformModify(): void $this->testSet( $this->testSpec->transform->modify, function (object $vin) { + $opts = new \stdClass(); + $opts->extra = property_exists($vin, 'store') ? $vin->store : (object) []; + $opts->modify = function ($val, $key, $parent) { + if ($key !== null && $parent !== null && is_string($val)) { + Struct::setprop($parent, $key, '@' . $val); + } + }; return Struct::transform( $vin->data, $vin->spec, - property_exists($vin, 'store') ? $vin->store : (object) [], - function ($val, $key, $parent) { - if ($key !== null && $parent !== null && is_string($val)) { - Struct::setprop($parent, $key, '@' . $val); - } - } + $opts ); } ); @@ -820,15 +822,18 @@ public function testValidateExact(): void public function testValidateInvalid(): void { + $count = 0; $this->testSet( $this->testSpec->validate->invalid, - function ($input) { + function ($input) use (&$count) { + $count++; return Struct::validate( property_exists($input, 'data') ? $input->data : (object) [], property_exists($input, 'spec') ? $input->spec : (object) [] ); } ); + $this->assertGreaterThan(0, $count, 'validate-invalid should have run at least one test entry'); } public function testValidateSpecial(): void diff --git a/py/NOTES.md b/py/NOTES.md new file mode 100644 index 00000000..e48a91fc --- /dev/null +++ b/py/NOTES.md @@ -0,0 +1,19 @@ +# Python Implementation Notes + +## undefined vs null + +Python has only `None` — there is no native distinction between "absent" and "null". +For this library: +- `None` is used to represent **property absence** (the TypeScript `undefined` equivalent). 
+- TypeScript tests relating to `undefined` should be treated as **property absence**: the key + does not exist in the dict, or the function parameter was not provided. +- JSON null is ambiguous with `None`. Where the distinction matters, the test runner uses + marker strings: `NULLMARK = '__NULL__'` for JSON null and `UNDEFMARK = '__UNDEF__'` for absent values. +- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues. + +## Type System + +This implementation uses bitfield integers for the type system, matching the TypeScript canonical. +Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are exported and `typify()` returns +integer bitfields. Use `typename()` to get the human-readable name for error messages. +Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`). diff --git a/py/REVIEW.md b/py/REVIEW.md new file mode 100644 index 00000000..d3fee009 --- /dev/null +++ b/py/REVIEW.md @@ -0,0 +1,143 @@ +# Python (py) - Review vs TypeScript Canonical + +## Overview + +The Python version is one of the most complete implementations, with **39 exported functions** closely matching the TypeScript canonical. It uses the unified `injdef` parameter pattern and has a full `InjectState` class. The main gaps are minor naming differences and a few missing utilities. + +--- + +## Missing Functions + +| Function | Category | Impact | +|----------|----------|--------| +| `replace` | String | No unified string replace wrapper | +| `getdef` | Property access | Not exported (may exist internally) | + +--- + +## Naming Differences + +| TS Name | Python Name | Notes | +|---------|-------------|-------| +| `jm` | `jo` | JSON map/object builder | +| `jt` | `ja` | JSON tuple/array builder | +| `joinurl` | `joinurl` | Also exists as standalone (TS uses `join` with `url` flag) | + +--- + +## API Signature Differences + +### 1. 
`walk` signature differs slightly + +- **TS**: `walk(val, before?, after?, maxdepth?, key?, parent?, path?)` +- **Python**: `walk(val, apply=None, key=UNDEF, parent=UNDEF, path=UNDEF, *, before=None, after=None, maxdepth=None)` +- **Notes**: Python uses keyword-only arguments for `before`/`after`/`maxdepth` and also supports a positional `apply` for backward compatibility. This is actually a reasonable Pythonic adaptation. + +### 2. Default parameter handling uses `UNDEF` sentinel + +- **TS**: Uses `undefined` (language native). +- **Python**: Uses `UNDEF = None` as sentinel, since Python's `None` maps to JSON `null`. +- **Impact**: This is a necessary language adaptation. However, `UNDEF = None` conflates Python's `None` with "no value". The TS version distinguishes between `undefined` and `null`. + +### 3. `validate` return type + +- **TS**: Returns validated data; errors collected in `injdef.errs` array or thrown. +- **Python**: Same pattern via `injdef` with `errs` list. +- **Notes**: Aligned correctly. + +--- + +## Structural Differences + +### InjectState vs Injection Class + +- **TS**: Class named `Injection` with methods `descend()`, `child()`, `setval()`, `toString()`. +- **Python**: Class named `InjectState` with same methods. +- **Impact**: Minor naming difference. The class is functionally equivalent. + +### Type Constants + +- **TS** and **Python** both use bitfield type constants (`T_any`, `T_noval`, etc.). +- **Python**: All constants present and matching. +- **Notes**: Fully aligned. + +### SKIP/DELETE Sentinels + +- Both versions export `SKIP` and `DELETE` with matching structure. + +--- + +## Significant Language Difference Issues + +### 1. `None` vs `undefined`/`null` Distinction + +- **Issue**: Python has only `None`, while JavaScript/TypeScript distinguishes `undefined` from `null`. The library uses `UNDEF = None`, which means Python cannot natively distinguish "absent value" from "JSON null". 
+- **Workaround**: The test runner uses `NULLMARK = '__NULL__'` and `UNDEFMARK = '__UNDEF__'` string markers, and a `nullModifier` to convert between them. +- **Impact**: This is an inherent language limitation. The workaround is adequate but care must be taken in edge cases where the distinction matters. + +### 2. Dictionary Ordering + +- Python 3.7+ guarantees insertion-order dict preservation, but `keysof` returns sorted keys to match TS behavior. This is correct. + +### 3. No Symbol Type + +- Python has no equivalent of JavaScript `Symbol`. The `T_symbol` type constant exists but `typify` will never return it. +- **Impact**: Minimal; symbols are rarely used in the data structures this library processes. + +### 4. Integer vs Float Distinction + +- Python natively distinguishes `int` from `float`, which maps well to TS's `T_integer` vs `T_decimal`. +- **Impact**: Good alignment; Python may actually be more precise here. + +### 5. Function Identity in Clone + +- Both versions copy function references rather than cloning them. Python's `callable` check via `isfunc` works correctly for this. + +--- + +## Validation Differences + +- **TS**: Uses `$MAP`, `$LIST`, `$STRING`, `$NUMBER`, `$INTEGER`, `$DECIMAL`, `$BOOLEAN`, `$NULL`, `$NIL`, `$FUNCTION`, `$INSTANCE`, `$ANY`, `$CHILD`, `$ONE`, `$EXACT`. +- **Python**: Same validators present. +- **Notes**: Fully aligned. + +--- + +## Transform Differences + +- **TS**: Supports `$DELETE`, `$COPY`, `$KEY`, `$ANNO`, `$MERGE`, `$EACH`, `$PACK`, `$REF`, `$FORMAT`, `$APPLY`, `$BT`, `$DS`, `$WHEN`. +- **Python**: Same transform commands present. +- **Notes**: Fully aligned. + +--- + +## Test Coverage + +Python tests cover all major categories matching TS: +- Minor functions, walk, merge, getpath, inject, transform, validate, select, JSON builders. +- Test categories are comprehensive and use the shared `test.json` spec. + +### Minor Gaps +- Edge case tests may differ slightly in coverage. 
+ +--- + +## Alignment Plan + +### Phase 1: Minor Fixes (Low Effort) +1. Add `replace(s, from_str, to)` function if missing +2. Verify `getdef(val, alt)` is exported (add if missing) +3. Consider renaming `jo`/`ja` to `jm`/`jt` to match TS (or add aliases) + +### Phase 2: Naming Alignment +4. Consider renaming `InjectState` to `Injection` to match TS class name +5. Ensure all type constant names exactly match TS + +### Phase 3: Edge Case Alignment +6. Review `None`/`UNDEF` handling for edge cases where TS distinguishes `undefined` from `null` +7. Verify `clone` behavior matches TS for all edge cases (functions, instances) +8. Run full test suite comparison against TS test.json to identify any failing cases + +### Phase 4: Documentation +9. Document the `None` vs `undefined`/`null` language difference and its implications +10. Document any Python-specific idioms used (keyword-only args in `walk`, etc.) diff --git a/py/tests/test_voxgig_struct.py b/py/tests/test_voxgig_struct.py index 0cafbcce..78caf9d8 100644 --- a/py/tests/test_voxgig_struct.py +++ b/py/tests/test_voxgig_struct.py @@ -21,7 +21,7 @@ ) from sdk import SDK -from voxgig_struct import InjectState +from voxgig_struct import Injection from voxgig_struct.voxgig_struct import ( T_noval, T_scalar, T_function, T_symbol, T_any, T_node, T_instance, T_null, ) @@ -510,7 +510,7 @@ def handler(inj, val, ref, store): # state.meta["step"] = state.meta["step"]+1 # return out - # state = InjectState( + # state = Injection( # meta = {"step":0}, # handler = handler_fn, # mode = "val", diff --git a/py/voxgig_struct/__init__.py b/py/voxgig_struct/__init__.py index 54f46092..6084fc25 100644 --- a/py/voxgig_struct/__init__.py +++ b/py/voxgig_struct/__init__.py @@ -5,6 +5,9 @@ delprop, escre, escurl, + filter, + flatten, + getdef, getelem, getpath, getprop, @@ -18,64 +21,52 @@ isnode, items, ja, + jm, jo, + join, joinurl, jsonify, + jt, keysof, merge, pad, pathify, + replace, select, + setpath, setprop, size, slice, 
stringify, strkey, transform, + typename, typify, validate, walk, - InjectState, + Injection, StructUtility, + checkPlacement, + injectorArgs, + injectChild, + SKIP, + DELETE, + T_any, + T_noval, + T_boolean, + T_decimal, + T_integer, + T_number, + T_string, + T_function, + T_symbol, + T_null, + T_list, + T_map, + T_instance, + T_scalar, + T_node, + M_KEYPRE, + M_KEYPOST, + M_VAL, ) - - -__all__ = [ - 'clone', - 'delprop', - 'escre', - 'escurl', - 'getelem', - 'getpath', - 'getprop', - 'haskey', - 'inject', - 'isempty', - 'isfunc', - 'iskey', - 'islist', - 'ismap', - 'isnode', - 'items', - 'ja', - 'jo', - 'joinurl', - 'jsonify', - 'keysof', - 'merge', - 'pad', - 'pathify', - 'select', - 'setprop', - 'size', - 'slice', - 'stringify', - 'strkey', - 'transform', - 'typify', - 'validate', - 'walk', - 'InjectState', - 'StructUtility', -] - diff --git a/py/voxgig_struct/voxgig_struct.py b/py/voxgig_struct/voxgig_struct.py index d32ec937..df343374 100644 --- a/py/voxgig_struct/voxgig_struct.py +++ b/py/voxgig_struct/voxgig_struct.py @@ -60,7 +60,7 @@ # Special keys. S_DKEY = '$KEY' -S_DMETA = '`$META`' +S_BANNO = '`$ANNO`' S_DTOP = '$TOP' S_DERRS = '$ERRS' S_DSPEC = '$SPEC' @@ -148,7 +148,7 @@ DELETE = {'`$DELETE`': True} -class InjectState: +class Injection: """ Injection state used for recursive injection into JSON-like data structures. """ @@ -212,12 +212,12 @@ def descend(self): return self.dparent - def child(self, keyI: int, keys: List[str]) -> 'InjectState': + def child(self, keyI: int, keys: List[str]) -> 'Injection': """Create a child state object with the given key index and keys.""" key = strkey(keys[keyI]) val = self.val - cinj = InjectState( + cinj = Injection( mode=self.mode, full=self.full, keyI=keyI, @@ -249,6 +249,13 @@ def setval(self, val: Any, ancestor: Optional[int] = None) -> Any: return setprop(getelem(self.nodes, 0 - ancestor), getelem(self.path, 0 - ancestor), val) +def getdef(val, alt): + "Get a defined value. Returns alt if val is undefined." 
+ if val is UNDEF or val is None: + return alt + return val + + def isnode(val: Any = UNDEF) -> bool: "Value is a node - defined, and a map (hash) or list (array)." return isinstance(val, (dict, list)) @@ -569,6 +576,22 @@ def escurl(s: Any): return urllib.parse.quote(s, safe="") +def replace(s, from_pat, to_str): + "Replace a search string (all), or a regexp, in a source string." + rs = s + ts = typify(s) + if 0 == (T_string & ts): + rs = stringify(s) + elif 0 < ((T_noval | T_null) & ts): + rs = S_MT + else: + rs = stringify(s) + if isinstance(from_pat, str): + return rs.replace(from_pat, str(to_str)) + else: + return re.sub(from_pat, str(to_str), rs) + + def join(arr, sep=UNDEF, url=UNDEF): if not islist(arr): return S_MT @@ -699,6 +722,11 @@ def ja(*v: Any) -> List[Any]: return a +# Aliases to match TS canonical names +jm = jo +jt = ja + + def select_AND(state, _val, _ref, store): if S_MKEYPRE == state.mode: terms = getprop(state.parent, state.key) @@ -1199,8 +1227,8 @@ def getpath(store, path, injdef=UNDEF): return UNDEF val = store - # Support both dict-style injdef and InjectState instance - if isinstance(injdef, InjectState): + # Support both dict-style injdef and Injection instance + if isinstance(injdef, Injection): base = injdef.base dparent = injdef.dparent inj_meta = injdef.meta @@ -1287,7 +1315,7 @@ def getpath(store, path, injdef=UNDEF): val = getprop(val, part) # Injdef may provide a custom handler to modify found value. - handler = injdef.handler if isinstance(injdef, InjectState) else (getprop(injdef, 'handler') if injdef else UNDEF) + handler = injdef.handler if isinstance(injdef, Injection) else (getprop(injdef, 'handler') if injdef else UNDEF) if handler and isfunc(handler): ref = pathify(path) val = handler(injdef, val, ref, store) @@ -1335,14 +1363,14 @@ def inject(val, store, injdef=UNDEF): valtype = type(val) # Reuse existing injection state during recursion; otherwise create a new one. 
- if isinstance(injdef, InjectState): + if isinstance(injdef, Injection): inj = injdef else: inj = injdef # may be dict/UNDEF; used below via getprop # Create state if at root of injection. The input value is placed # inside a virtual parent holder to simplify edge cases. parent = {S_DTOP: val} - inj = InjectState( + inj = Injection( mode=S_MVAL, full=False, keyI=0, @@ -1555,16 +1583,16 @@ def transform_KEY(inj, val, ref, store): if ismap(inj.dparent) and inj.key is not UNDEF and haskey(inj.dparent, inj.key): return getprop(inj.dparent, inj.key) - meta = getprop(parent, S_DMETA) + meta = getprop(parent, S_BANNO) return getprop(meta, S_KEY, getprop(path, len(path) - 2)) -def transform_META(inj, val, ref, store): +def transform_ANNO(inj, val, ref, store): """ - Injection handler that removes the `'$META'` key (after capturing if needed). + Annotate node. Does nothing itself, just used by other injectors, and is removed when called. """ parent = inj.parent - setprop(parent, S_DMETA, UNDEF) + setprop(parent, S_BANNO, UNDEF) return UNDEF @@ -1658,7 +1686,7 @@ def transform_EACH(inj, val, ref, store): # Keep key in meta for usage by `$KEY` copy_child = clone(child_template) if ismap(copy_child): - setprop(copy_child, S_DMETA, {S_KEY: k}) + setprop(copy_child, S_BANNO, {S_KEY: k}) tval.append(copy_child) tcurrent = list(src.values()) if ismap(src) else src @@ -1734,7 +1762,7 @@ def transform_PACK(inj, val, ref, store): src_items = items(src) new_src = [] for item in src_items: - setprop(item[1], S_DMETA, {S_KEY: item[0]}) + setprop(item[1], S_BANNO, {S_KEY: item[0]}) new_src.append(item[1]) src = new_src else: @@ -1763,11 +1791,11 @@ def transform_PACK(inj, val, ref, store): tchild = clone(child) setprop(tval, k, tchild) - anno = getprop(srcnode, S_DMETA) + anno = getprop(srcnode, S_BANNO) if anno is UNDEF: - delprop(tchild, S_DMETA) + delprop(tchild, S_BANNO) else: - setprop(tchild, S_DMETA, anno) + setprop(tchild, S_BANNO, anno) rval = {} @@ -1955,7 +1983,7 @@ def 
injectorArgs(argTypes, args): return found -def _injectChild(child, store, inj): +def injectChild(child, store, inj): cinj = inj if inj.prior is not UNDEF and inj.prior is not None: if inj.prior.prior is not UNDEF and inj.prior.prior is not None: @@ -1982,7 +2010,7 @@ def transform_FORMAT(inj, _val, _ref, store): tkey = getelem(inj.path, -2) target = getelem(inj.nodes, -2, lambda: getelem(inj.nodes, -1)) - cinj = _injectChild(child, store, inj) + cinj = injectChild(child, store, inj) resolved = cinj.val formatter = name if 0 < (T_function & typify(name)) else getprop(FORMATTER, name) @@ -2016,7 +2044,7 @@ def transform_APPLY(inj, _val, _ref, store): tkey = getelem(inj.path, -2) target = getelem(inj.nodes, -2, lambda: getelem(inj.nodes, -1)) - cinj = _injectChild(child, store, inj) + cinj = injectChild(child, store, inj) resolved = cinj.val try: @@ -2087,7 +2115,7 @@ def transform( '$DELETE': transform_DELETE, '$COPY': transform_COPY, '$KEY': transform_KEY, - '$META': transform_META, + '$ANNO': transform_ANNO, '$MERGE': transform_MERGE, '$EACH': transform_EACH, '$PACK': transform_PACK, @@ -2604,6 +2632,7 @@ def __init__(self): self.escurl = escurl self.filter = filter self.flatten = flatten + self.getdef = getdef self.getelem = getelem self.getpath = getpath self.getprop = getprop @@ -2616,8 +2645,10 @@ def __init__(self): self.ismap = ismap self.isnode = isnode self.items = items - self.ja = ja + self.jm = jm + self.jt = jt self.jo = jo + self.ja = ja self.join = join self.joinurl = joinurl self.jsonify = jsonify @@ -2625,7 +2656,7 @@ def __init__(self): self.merge = merge self.pad = pad self.pathify = pathify - self.DELETE = DELETE + self.replace = replace self.select = select self.setpath = setpath self.setprop = setprop @@ -2638,19 +2669,50 @@ def __init__(self): self.typename = typename self.validate = validate self.walk = walk + + self.SKIP = SKIP + self.DELETE = DELETE + self.tn = typename + + self.T_any = T_any + self.T_noval = T_noval + self.T_boolean = 
T_boolean + self.T_decimal = T_decimal + self.T_integer = T_integer + self.T_number = T_number + self.T_string = T_string + self.T_function = T_function + self.T_symbol = T_symbol + self.T_null = T_null + self.T_list = T_list + self.T_map = T_map + self.T_instance = T_instance + self.T_scalar = T_scalar + self.T_node = T_node + + self.checkPlacement = checkPlacement + self.injectorArgs = injectorArgs + self.injectChild = injectChild __all__ = [ - 'InjectState', + 'Injection', 'StructUtility', + 'checkPlacement', 'clone', + 'delprop', 'escre', 'escurl', + 'filter', + 'flatten', + 'getdef', 'getelem', 'getpath', 'getprop', 'haskey', 'inject', + 'injectChild', + 'injectorArgs', 'isempty', 'isfunc', 'iskey', @@ -2658,19 +2720,49 @@ def __init__(self): 'ismap', 'isnode', 'items', + 'ja', + 'jm', + 'jo', + 'join', 'joinurl', + 'jsonify', + 'jt', 'keysof', 'merge', 'pad', 'pathify', + 'replace', + 'select', + 'setpath', 'setprop', 'size', 'slice', 'stringify', 'strkey', 'transform', + 'typename', 'typify', 'validate', 'walk', + 'SKIP', + 'DELETE', + 'T_any', + 'T_noval', + 'T_boolean', + 'T_decimal', + 'T_integer', + 'T_number', + 'T_string', + 'T_function', + 'T_symbol', + 'T_null', + 'T_list', + 'T_map', + 'T_instance', + 'T_scalar', + 'T_node', + 'M_KEYPRE', + 'M_KEYPOST', + 'M_VAL', ] diff --git a/rb/NOTES.md b/rb/NOTES.md new file mode 100644 index 00000000..48a61b84 --- /dev/null +++ b/rb/NOTES.md @@ -0,0 +1,22 @@ +# Ruby Implementation Notes + +## undefined vs null + +Ruby has only `nil` — there is no native distinction between "absent" and "null". +For this library: +- `UNDEF = Object.new.freeze` is used as a sentinel for **property absence** + (the TypeScript `undefined` equivalent). This is a unique frozen object that cannot + collide with any real data value. +- `nil` represents Ruby's native null, which maps to JSON null. 
+- TypeScript tests relating to `undefined` should be treated as **property absence**: the key + does not exist in the Hash, or the function parameter was not provided. +- Where the distinction matters, the test runner uses marker strings: + `NULLMARK = '__NULL__'` for JSON null and `UNDEFMARK = '__UNDEF__'` for absent values. +- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues. + +## Type System + +This implementation uses bitfield integers for the type system, matching the TypeScript canonical. +Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are defined as module constants and +`typify()` returns integer bitfields. Use `typename()` to get the human-readable name for +error messages. Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`). diff --git a/rb/REVIEW.md b/rb/REVIEW.md new file mode 100644 index 00000000..fe30db94 --- /dev/null +++ b/rb/REVIEW.md @@ -0,0 +1,260 @@ +# Ruby (rb) - Review vs TypeScript Canonical + +## Overview + +The Ruby version is **partially complete**. It implements the basic utility functions and the core operations (inject, transform, validate), but many tests are **skipped**, suggesting the implementations may be incomplete or broken. Several functions present in TS are missing entirely. The API uses an older positional-parameter pattern rather than the unified `injdef` object. 
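The difference between the older positional-parameter pattern and the unified `injdef` object mentioned above can be sketched as a small adapter. The sketch is written in Python rather than Ruby, and it assumes that `injdef` is a plain mapping carrying `extra`, `modify`, and `errs` entries. The helpers `pack_injdef`, `legacy_transform`, and `fake_transform` are hypothetical names and are not part of any of the reviewed libraries.

```python
def pack_injdef(extra=None, modify=None, errs=None):
    """Hypothetical helper: fold legacy positional options into one injdef dict."""
    injdef = {}
    if extra is not None:
        injdef["extra"] = extra
    if modify is not None:
        injdef["modify"] = modify
    if errs is not None:
        injdef["errs"] = errs
    return injdef


def legacy_transform(transform_fn, data, spec, extra=None, modify=None):
    """Accept the older positional call shape and forward a single injdef."""
    return transform_fn(data, spec, pack_injdef(extra=extra, modify=modify))


def fake_transform(data, spec, injdef=None):
    """Stand-in for a real transform: it only echoes its arguments."""
    return {"data": data, "spec": spec, "injdef": injdef}


result = legacy_transform(fake_transform, {"a": 1}, {"a": "`a`"}, extra={"x": 2})
```

Keeping the options in one object means a new option becomes a new key rather than another positional parameter, which is the extensibility concern the review raises for the positional signatures.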
+ +--- + +## Missing Functions + +| Function | Category | Impact | +|----------|----------|--------| +| `getelem` | Property access | No negative-index element access | +| `getdef` | Property access | No defined-or-default helper | +| `delprop` | Property access | No dedicated property deletion | +| `setpath` | Path operations | Cannot set values at nested paths | +| `select` | Query operations | No MongoDB-style query/filter | +| `size` | Collection | No unified size function | +| `slice` | Collection | No array/string slicing | +| `flatten` | Collection | No array flattening | +| `filter` | Collection | No predicate filtering | +| `pad` | String | No string padding | +| `replace` | String | No unified string replace | +| `join` | String | No general join (only `joinurl`) | +| `jsonify` | Serialization | No JSON serialization with formatting | +| `typename` | Type system | No type name function | +| `jm`/`jt` | JSON builders | No JSON builder functions | +| `checkPlacement` | Advanced | No placement validation | +| `injectorArgs` | Advanced | No injector argument validation | +| `injectChild` | Advanced | No child injection helper | + +--- + +## `typify` Returns Strings Instead of Bitfield + +- **TS**: Returns numeric bitfield with constants (`T_string`, `T_integer | T_number`, etc.). +- **Ruby**: Returns simple strings (`"null"`, `"string"`, `"number"`, `"boolean"`, `"function"`, `"array"`, `"object"`). +- **Impact**: The entire bitfield type system is missing. Cannot distinguish integer from decimal, no composite type checks, no `T_scalar`/`T_node` groupings. + +--- + +## No Type Constants + +The Ruby version has **no bitfield type constants** (`T_any`, `T_noval`, `T_boolean`, `T_decimal`, `T_integer`, `T_number`, `T_string`, `T_function`, `T_symbol`, `T_null`, `T_list`, `T_map`, `T_instance`, `T_scalar`, `T_node`). + +--- + +## API Signature Differences + +### 1. `inject` uses positional parameters + +- **TS**: `inject(val, store, injdef?)`. 
+- **Ruby**: `inject(val, store, modify=nil, current=nil, state=nil, flag=nil)`. +- **Impact**: Less extensible; harder to add new options. + +### 2. `transform` uses positional parameters + +- **TS**: `transform(data, spec, injdef?)`. +- **Ruby**: `transform(data, spec, extra=nil, modify=nil)`. + +### 3. `validate` uses positional parameters + +- **TS**: `validate(data, spec, injdef?)`. +- **Ruby**: `validate(data, spec, extra=nil, collecterrs=nil)`. + +### 4. `getpath` parameter order differs + +- **TS**: `getpath(store, path, injdef?)`. +- **Ruby**: `getpath(path, store, current=nil, state=nil)`. +- **Impact**: Different parameter order. + +### 5. `walk` has no `before`/`after` or `maxdepth` + +- **TS**: `walk(val, before?, after?, maxdepth?, key?, parent?, path?)`. +- **Ruby**: `walk(val, apply, key=nil, parent=nil, path=[])` - single callback, no depth limit. +- **Impact**: Post-order only, no depth protection. + +### 6. `setprop` overloads deletion + +- **TS**: Has separate `delprop`. +- **Ruby**: `setprop(parent, key, val=:no_val_provided)` - omitting val deletes. +- **Impact**: Different deletion semantics. + +### 7. `haskey` accepts variable arguments + +- **Ruby**: `haskey(*args)` accepts either `[val, key]` array or `(val, key)` separate args. +- **TS**: `haskey(val, key)` - always two parameters. +- **Impact**: Non-standard overloading. + +--- + +## Validation Differences + +- **TS**: Uses `$MAP`, `$LIST`, `$STRING`, `$NUMBER`, `$INTEGER`, `$DECIMAL`, `$BOOLEAN`, `$NULL`, `$NIL`, `$FUNCTION`, `$INSTANCE`, `$ANY`, `$CHILD`, `$ONE`, `$EXACT`. +- **Ruby**: Uses `$OBJECT`, `$ARRAY`, `$STRING`, `$NUMBER`, `$BOOLEAN`, `$FUNCTION`, `$ANY`, `$CHILD`, `$ONE`, `$EXACT`. +- **Missing**: `$MAP`, `$LIST`, `$INTEGER`, `$DECIMAL`, `$NULL`, `$NIL`, `$INSTANCE`. + +--- + +## Transform Differences + +- **TS**: Full set: `$DELETE`, `$COPY`, `$KEY`, `$ANNO`, `$MERGE`, `$EACH`, `$PACK`, `$REF`, `$FORMAT`, `$APPLY`, `$BT`, `$DS`, `$WHEN`. 
+- **Ruby**: Has `$DELETE`, `$COPY`, `$KEY`, `$META`, `$MERGE`, `$EACH`, `$PACK`. Missing: `$ANNO`, `$REF`, `$FORMAT`, `$APPLY`, `$BT`, `$DS`, `$WHEN`. +- **Impact**: Significantly fewer transform capabilities. + +--- + +## Skipped Tests (Critical Issue) + +The following tests are **explicitly skipped** in the test suite, indicating incomplete or broken implementations: + +- `test_transform_paths` - Path-based transforms +- `test_transform_cmds` - Command parsing +- `test_transform_each` - $EACH command +- `test_transform_pack` - $PACK command +- `test_transform_modify` - Custom modifier +- `test_transform_extra` - Custom handlers +- `test_validate_basic` - Basic validation +- `test_validate_child` - Nested validation +- `test_validate_one` - One-of validator +- `test_validate_exact` - Exact value matching +- `test_validate_invalid` - Error collection + +This means **most transform and all validate tests are skipped**, suggesting these implementations are incomplete. + +--- + +## Structural/Architectural Gaps + +### No Injection Class +- Uses plain hashes/state objects instead of a dedicated class. +- State management through `state` hash parameter. + +### Extra Helper Functions +- `deep_merge(a, b)` - Exposed as module function (TS keeps merge internal). +- `sorted(val)` - Recursive hash key sorting (TS handles this in stringify). +- `conv(val)` - UNDEF-to-nil conversion helper. +- `log(msg)` - Debug logging helper. + +### Internal Functions Exposed +- `_injectstr`, `_injecthandler`, `_setparentprop` are exposed (prefixed with `_` but still accessible). + +--- + +## Significant Language Difference Issues + +### 1. No `undefined` vs `null` Distinction + +- **Issue**: Ruby has only `nil`. +- **Workaround**: Uses `UNDEF = Object.new.freeze` as sentinel object. +- **Impact**: Better than string sentinel approaches (Python/PHP) since it's a unique object. Cannot accidentally match a real data value. + +### 2. 
Hash vs Array Distinction + +- **Issue**: Ruby clearly distinguishes `Hash` from `Array`, which is better than Lua/PHP. +- **Impact**: `ismap`/`islist` are straightforward. No ambiguity issues. + +### 3. Symbols vs Strings for Keys + +- **Issue**: Ruby Hashes can use either `:symbol` or `"string"` keys. JSON parsing typically produces string keys. +- **Impact**: All key operations must handle string keys. Symbol keys from Ruby-native code will not match string-key lookups (`{a: 1}["a"]` is `nil`). +- **Recommendation**: Ensure all key comparisons use string keys consistently. + +### 4. No Integer/Float Distinction in `typify` + +- **Issue**: Ruby has `Integer` and `Float` classes, but `typify` returns just `"number"` for both. +- **Impact**: Cannot distinguish integer from decimal at the type system level. +- **Recommendation**: When adding bitfield type system, use `val.is_a?(Integer)` vs `val.is_a?(Float)`. + +### 5. Procs vs Lambdas vs Methods + +- **Issue**: Ruby has several callable forms: `Proc` objects (lambdas are `Proc` instances for which `lambda?` returns true), `Method` objects, and blocks (which are not objects until captured as a `Proc`). +- **Impact**: `isfunc` uses `val.respond_to?(:call)`, which covers every callable object. This is correct behavior. + +### 6. `inject` Name Conflict + +- **Issue**: Ruby's `Enumerable#inject` (aka `reduce`) is a core method. The library's `inject` module function shadows this conceptually. +- **Impact**: No actual conflict since the library function is on the `VoxgigStruct` module, but it may confuse Ruby developers. + +--- + +## Test Coverage + +Tests use the Minitest framework. Coverage is **incomplete**: +- Minor function tests: Present and passing +- Walk tests: Present and passing +- Merge tests: Present and passing +- Getpath tests: Present and passing +- Inject tests: Present and passing +- Transform tests: **Mostly skipped** +- Validate tests: **All skipped** +- Select tests: **Not present** (no `select` function) + +--- + +## Alignment Plan + +### Phase 1: Complete Transform Implementation (Critical) +1. Fix/complete `transform` to pass `transform-paths` tests +2.
Fix/complete `transform` to pass `transform-cmds` tests +3. Fix/complete `transform_each` to pass `transform-each` tests +4. Fix/complete `transform_pack` to pass `transform-pack` tests +5. Add `transform_anno` ($ANNO command) +6. Add `transform_ref` ($REF command) +7. Add `transform_format` ($FORMAT command) +8. Add `transform_apply` ($APPLY command) +9. Add `$BT`, `$DS`, `$WHEN` support +10. Fix `transform-modify` and `transform-extra` support +11. Unskip all transform tests and ensure they pass + +### Phase 2: Complete Validate Implementation (Critical) +12. Fix/complete `validate` to pass `validate-basic` tests +13. Fix/complete `validate_child` to pass `validate-child` tests +14. Fix/complete `validate_one` to pass `validate-one` tests +15. Fix/complete `validate_exact` to pass `validate-exact` tests +16. Fix error collection to pass `validate-invalid` tests +17. Add `$MAP`, `$LIST`, `$INTEGER`, `$DECIMAL`, `$NULL`, `$NIL`, `$INSTANCE` validators +18. Unskip all validate tests and ensure they pass + +### Phase 3: Missing Core Functions +19. Implement `select(children, query)` with all operators ($AND, $OR, $NOT, $GT, $LT, $GTE, $LTE, $LIKE) +20. Implement `setpath(store, path, val, injdef)` +21. Implement `delprop(parent, key)` +22. Implement `getelem(val, key, alt)` with negative index support + +### Phase 4: Type System +23. Convert `typify` to return bitfield integers +24. Add all type constants (`T_any`, `T_noval`, `T_boolean`, etc.) +25. Add `typename(t)` function +26. Export `SKIP` and `DELETE` sentinels (if not already) + +### Phase 5: Missing Minor Functions +27. Add `getdef(val, alt)` helper +28. Add `size(val)` function +29. Add `slice(val, start, end, mutate)` function +30. Add `flatten(list, depth)` function +31. Add `filter(val, check)` function +32. Add `pad(str, padding, padchar)` function +33. Add `replace(s, from, to)` function +34. Add `join(arr, sep, url)` function +35. Add `jsonify(val, flags)` function +36. 
Add `jm`/`jt` JSON builder functions + +### Phase 6: API Signature Alignment +37. Refactor `walk` to support `before`/`after` callbacks and `maxdepth` +38. Refactor `inject` to use `injdef` object parameter +39. Refactor `transform` to use `injdef` object parameter +40. Refactor `validate` to use `injdef` object parameter +41. Align `getpath` parameter order to `(store, path, injdef)` +42. Normalize `haskey` to always take `(val, key)` parameters + +### Phase 7: Injection System +43. Create `Injection` class with `descend()`, `child()`, `setval()` methods +44. Add `checkPlacement`, `injectorArgs`, `injectChild` functions + +### Phase 8: Test Completion +45. Add select tests +46. Add tests for all new functions +47. Run full test suite against shared `test.json` +48. Remove all test skips diff --git a/rb/voxgig_struct.rb b/rb/voxgig_struct.rb index 4e9f127b..78da4388 100644 --- a/rb/voxgig_struct.rb +++ b/rb/voxgig_struct.rb @@ -25,20 +25,62 @@ def self.conv(val) S_DTOP = '$TOP' S_DERRS = '$ERRS' + S_any = 'any' S_array = 'array' S_boolean = 'boolean' + S_decimal = 'decimal' S_function = 'function' + S_instance = 'instance' + S_integer = 'integer' + S_list = 'list' + S_map = 'map' + S_nil = 'nil' + S_node = 'node' S_number = 'number' + S_null = 'null' S_object = 'object' + S_scalar = 'scalar' S_string = 'string' - S_null = 'null' + S_symbol = 'symbol' S_MT = '' # empty string constant (used as a prefix) S_BT = '`' S_DS = '$' S_DT = '.' 
# delimiter for key paths S_CN = ':' # colon for unknown paths + S_SP = ' ' + S_VIZ = ': ' S_KEY = 'KEY' + # Types - bitfield integers matching TypeScript canonical + _t = 31 + T_any = (1 << _t) - 1; _t -= 1 + T_noval = 1 << _t; _t -= 1 + T_boolean = 1 << _t; _t -= 1 + T_decimal = 1 << _t; _t -= 1 + T_integer = 1 << _t; _t -= 1 + T_number = 1 << _t; _t -= 1 + T_string = 1 << _t; _t -= 1 + T_function = 1 << _t; _t -= 1 + T_symbol = 1 << _t; _t -= 1 + T_null = 1 << _t; _t -= 8 + T_list = 1 << _t; _t -= 1 + T_map = 1 << _t; _t -= 1 + T_instance = 1 << _t; _t -= 5 + T_scalar = 1 << _t; _t -= 1 + T_node = 1 << _t + + TYPENAME = [ + S_any, S_nil, S_boolean, S_decimal, S_integer, S_number, S_string, + S_function, S_symbol, S_null, + '', '', '', '', '', '', '', + S_list, S_map, S_instance, + '', '', '', '', + S_scalar, S_node, + ] + + SKIP = { '`$SKIP`' => true } + DELETE = { '`$DELETE`' => true } + # Unique undefined marker. UNDEF = Object.new.freeze @@ -289,14 +331,55 @@ def self.joinurl(parts) end.reject { |s| s.empty? }.join('/') end + # Get type name string from type bitfield value. + def self.typename(t) + tname = S_MT + TYPENAME.each_with_index do |tn, tI| + if tn != S_MT && 0 < (t & (1 << (31 - tI))) + tname = tn + end + end + tname + end + + # Determine the type of a value as a bitfield integer. def self.typify(value) - return "null" if value.nil? - return "array" if islist(value) - return "object" if ismap(value) - return "boolean" if [true, false].include?(value) - return "function" if isfunc(value) - return "number" if value.is_a?(Numeric) - value.class.to_s.downcase + return T_noval if value.nil? + return T_noval if value.equal?(UNDEF) + + if value == true || value == false + return T_scalar | T_boolean + end + + if isfunc(value) + return T_scalar | T_function + end + + if value.is_a?(Integer) + return T_scalar | T_number | T_integer + end + + if value.is_a?(Float) + return value.nan? ? 
T_noval : (T_scalar | T_number | T_decimal) + end + + if value.is_a?(String) + return T_scalar | T_string + end + + if value.is_a?(Symbol) + return T_scalar | T_symbol + end + + if islist(value) + return T_node | T_list + end + + if ismap(value) + return T_node | T_map + end + + T_any end def self.walk(val, apply, key = nil, parent = nil, path = []) @@ -760,7 +843,7 @@ def self._invalid_type_msg(path, needtype, vt, v, _whence = nil) 'Expected ' + (path.length > 1 ? ('field ' + pathify(path, 1) + ' to be ') : '') + needtype + ', but found ' + - (v.nil? ? '' : vt + ': ') + vs + + (v.nil? ? '' : typename(vt) + S_VIZ) + vs + # Uncomment to help debug validation errors. # ' [' + _whence + ']' + '.' @@ -771,7 +854,7 @@ def self.validate_string(state, _val = nil, current = nil, _ref = nil, _store = out = getprop(current, state[:key]) t = typify(out) - if t != S_string + if 0 == (T_string & t) msg = _invalid_type_msg(state[:path], S_string, t, out, 'V1010') state[:errs].push(msg) return nil @@ -791,7 +874,7 @@ def self.validate_number(state, _val = nil, current = nil, _ref = nil, _store = out = getprop(current, state[:key]) t = typify(out) - if t != S_number + if 0 == (T_number & t) state[:errs].push(_invalid_type_msg(state[:path], S_number, t, out, 'V1020')) return nil end @@ -804,7 +887,7 @@ def self.validate_boolean(state, _val = nil, current = nil, _ref = nil, _store = out = getprop(current, state[:key]) t = typify(out) - if t != S_boolean + if 0 == (T_boolean & t) state[:errs].push(_invalid_type_msg(state[:path], S_boolean, t, out, 'V1030')) return nil end @@ -817,7 +900,7 @@ def self.validate_object(state, _val = nil, current = nil, _ref = nil, _store = out = getprop(current, state[:key]) t = typify(out) - if t != S_object + if 0 == (T_map & t) state[:errs].push(_invalid_type_msg(state[:path], S_object, t, out, 'V1040')) return nil end @@ -830,7 +913,7 @@ def self.validate_array(state, _val = nil, current = nil, _ref = nil, _store = n out = getprop(current, 
state[:key]) t = typify(out) - if t != S_array + if 0 == (T_list & t) state[:errs].push(_invalid_type_msg(state[:path], S_array, t, out, 'V1050')) return nil end @@ -843,7 +926,7 @@ def self.validate_function(state, _val = nil, current = nil, _ref = nil, _store out = getprop(current, state[:key]) t = typify(out) - if t != S_function + if 0 == (T_function & t) state[:errs].push(_invalid_type_msg(state[:path], S_function, t, out, 'V1060')) return nil end @@ -1072,19 +1155,19 @@ def self._validation(pval, key = nil, parent = nil, state = nil, current = nil, ptype = typify(pval) # Delete any special commands remaining. - return if ptype == S_string && pval.include?(S_DS) + return if 0 != (T_string & ptype) && pval.include?(S_DS) ctype = typify(cval) # Type mismatch. if ptype != ctype && !pval.nil? - state[:errs].push(_invalid_type_msg(state[:path], ptype, ctype, cval, 'V0010')) + state[:errs].push(_invalid_type_msg(state[:path], typename(ptype), ctype, cval, 'V0010')) return end if ismap(cval) if !ismap(pval) - state[:errs].push(_invalid_type_msg(state[:path], ptype, ctype, cval, 'V0020')) + state[:errs].push(_invalid_type_msg(state[:path], typename(ptype), ctype, cval, 'V0020')) return end @@ -1110,7 +1193,7 @@ def self._validation(pval, key = nil, parent = nil, state = nil, current = nil, end elsif islist(cval) if !islist(pval) - state[:errs].push(_invalid_type_msg(state[:path], ptype, ctype, cval, 'V0030')) + state[:errs].push(_invalid_type_msg(state[:path], typename(ptype), ctype, cval, 'V0030')) end else # Spec value was a default, copy over data