Skip to content

Commit 4d26940

Browse files
authored
Merge pull request #52 from voxgig/claude/review-language-differences-b0FbZ
Add language alignment review reports for all non-TS implementations
2 parents d05df77 + 2fd8e8b commit 4d26940

30 files changed

+5765
-2343
lines changed

cpp/NOTES.md

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
# C++ Implementation Notes
2+
3+
## undefined vs null
4+
5+
C++ (via nlohmann::json) has only `null` (JSON null) — there is no native distinction between
6+
"absent" and "null".
7+
For this library:
8+
- `nullptr` / `json(nullptr)` is used to represent **property absence** (the TypeScript
9+
`undefined` equivalent).
10+
- TypeScript tests relating to `undefined` should be treated as **property absence**: the key
11+
does not exist in the JSON object, or the function parameter was not provided.
12+
- JSON null is ambiguous with absent. Where the distinction matters, the test runner should use
13+
marker strings: `NULLMARK = "__NULL__"` for JSON null and `UNDEFMARK = "__UNDEF__"` for absent values.
14+
- Consider using `std::optional<json>` to distinguish absent from null where needed.
15+
- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues.
16+
17+
## Type System
18+
19+
This implementation uses bitfield integers for the type system, matching the TypeScript canonical.
20+
Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are defined in the `VoxgigStruct` namespace
21+
and `typify()` returns integer bitfields. Use `typename()` to get the human-readable name for
22+
error messages. Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`).

cpp/REVIEW.md

Lines changed: 234 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,234 @@
1+
# C++ (cpp) - Review vs TypeScript Canonical
2+
3+
## Overview
4+
5+
The C++ version is the **most incomplete** implementation, with only **~18 functions** out of 40+. It covers basic type checking, property access, walk, merge, stringify, and clone. All major subsystems (getpath, setpath, inject, transform, validate, select) are **entirely missing**. The API design uses an unusual pattern where all functions take `args_container&&` (vector of JSON values), which differs significantly from all other implementations.
6+
7+
---
8+
9+
## Missing Functions
10+
11+
### Critical (Core Operations - All Missing)
12+
| Function | Category | Impact |
13+
|----------|----------|--------|
14+
| `getpath` | Path operations | Cannot navigate nested structures by path |
15+
| `setpath` | Path operations | Cannot set values at nested paths |
16+
| `inject` | Injection | No value injection from store |
17+
| `transform` | Transform | No data transformation capability |
18+
| `validate` | Validation | No data validation capability |
19+
| `select` | Query | No query/filter on children |
20+
21+
### Minor Utilities (All Missing)
22+
| Function | Category | Impact |
23+
|----------|----------|--------|
24+
| `getelem` | Property access | No negative-index element access |
25+
| `getdef` | Property access | No defined-or-default helper |
26+
| `delprop` | Property access | No dedicated property deletion |
27+
| `size` | Collection | No unified size function |
28+
| `slice` | Collection | No array/string slicing |
29+
| `flatten` | Collection | No array flattening |
30+
| `filter` | Collection | No predicate filtering |
31+
| `pad` | String | No string padding |
32+
| `replace` | String | No unified string replace |
33+
| `join` | String | No general join function |
34+
| `jsonify` | Serialization | No JSON formatting |
35+
| `strkey` | String | No key-to-string conversion |
36+
| `typename` | Type system | No type name function |
37+
| `typify` | Type system | No type identification function |
38+
| `pathify` | String | No path-to-string conversion |
39+
| `jm`/`jt` | JSON builders | No JSON builder functions |
40+
| `checkPlacement` | Advanced | No placement validation |
41+
| `injectorArgs` | Advanced | No injector argument validation |
42+
| `injectChild` | Advanced | No child injection helper |
43+
44+
---
45+
46+
## Architectural Issues
47+
48+
### 1. All Functions Take `args_container&&`
49+
50+
- **TS**: Functions have named, typed parameters (e.g., `isnode(val: any)`).
51+
- **C++**: All functions take `args_container&&` (aka `std::vector<json>&&`), extracting parameters by position from the vector.
52+
- **Impact**:
53+
- No compile-time parameter validation.
54+
- No IDE autocompletion for parameters.
55+
- Runtime errors for wrong argument count/types.
56+
- Cannot distinguish between functions by signature.
57+
- Makes the API feel like a scripting language dispatch table rather than a C++ library.
58+
- **Recommendation**: Consider proper C++ function signatures with typed parameters. The `args_container` pattern should only be used for the test runner dispatch, not for the public API.
59+
60+
### 2. `walk` Uses `intptr_t` to Pass Function Pointers Through JSON
61+
62+
- **TS**: `walk(val, apply)` where `apply` is a function.
63+
- **C++**: The apply function pointer is cast to `intptr_t`, stored in a JSON number, and cast back when needed.
64+
- **Impact**:
65+
- **Extremely unsafe** - pointer-as-integer casting through JSON is undefined behavior.
66+
- Breaks if JSON library modifies the number (e.g., float conversion).
67+
- Not portable across architectures.
68+
- No type safety for the callback.
69+
- **Recommendation**: Redesign walk to take the callback as a separate parameter, not embedded in the args vector.
70+
71+
### 3. No Injection System
72+
73+
- No `Injection` class or equivalent.
74+
- No injection state management.
75+
- No type constants.
76+
- No `SKIP`/`DELETE` sentinels.
77+
78+
### 4. `clone` Is Shallow
79+
80+
- **TS**: Deep clones via `JSON.parse(JSON.stringify(val))`.
81+
- **C++**: Simple JSON copy (which is deep for nlohmann::json, but the implementation returns `nullptr` for null, suggesting it may not be handling all cases).
82+
83+
---
84+
85+
## Existing Function Issues
86+
87+
### 1. `isfunc` Uses Template Specialization
88+
89+
- **TS**: Simple `typeof val === 'function'` check.
90+
- **C++**: Complex template specialization that returns `true` for `std::function<json(args_container&&)>` and `false` for everything else.
91+
- **Impact**: Only detects one specific function type. Cannot detect lambdas, function pointers, or other callables.
92+
93+
### 2. `iskey` Handles Booleans
94+
95+
- The C++ version explicitly returns `false` for booleans in `iskey`, which is correct (matching TS behavior since `typeof true` is not `"string"` or `"number"` in JS). Good.
96+
97+
### 3. `setprop` Array Handling
98+
99+
- Uses direct vector pointer manipulation (`(*it).data()`) for array operations.
100+
- **Impact**: May have memory safety issues with iterator invalidation.
101+
102+
### 4. `stringify` Is Minimal
103+
104+
- Basic `dump()` call with quote stripping and optional truncation.
105+
- Missing: sorted keys, custom formatting, depth handling.
106+
107+
---
108+
109+
## Significant Language Difference Issues
110+
111+
### 1. No Dynamic Typing
112+
113+
- **Issue**: C++ is statically typed. The library uses `nlohmann::json` to provide dynamic JSON values, but C++ has no native equivalent of JavaScript's dynamic typing.
114+
- **Impact**: Every operation requires type checking at runtime through the JSON library's type system. This is verbose but functional.
115+
116+
### 2. No `undefined` vs `null` Distinction
117+
118+
- **Issue**: `nlohmann::json` has `null` but no `undefined`.
119+
- **Impact**: Same as all other non-JS implementations.
120+
121+
### 3. Memory Management
122+
123+
- **Issue**: C++ requires explicit memory management. The `nlohmann::json` type handles its own memory, but function pointers, callbacks, and the `Utility`/`Provider` classes have manual memory management.
124+
- **Impact**: Potential for memory leaks or use-after-free in the `Provider` and `Utility` classes.
125+
126+
### 4. No Garbage Collection
127+
128+
- **Issue**: Circular references in data structures cannot be automatically collected.
129+
- **Impact**: The `walk` function must be careful about reference cycles. The `inject` system (when implemented) must handle cycles explicitly.
130+
131+
### 5. No Regular Expression Literals
132+
133+
- **Issue**: C++ uses `<regex>` library with string-based patterns.
134+
- **Impact**: `escre` manually escapes characters (correct approach).
135+
136+
### 6. No Closures as First-Class Citizens (Pre-C++11)
137+
138+
- **Issue**: C++11 lambdas exist but are not JSON-serializable. The current approach of casting function pointers to integers is extremely fragile.
139+
- **Impact**: The callback/handler system needs a completely different design from TS.
140+
141+
### 7. No Exception-Safe Error Collection
142+
143+
- **Issue**: C++ exception handling is more expensive than JS try/catch. The validate system (when implemented) should prefer error collection over exceptions.
144+
- **Impact**: Design consideration for future implementation.
145+
146+
### 8. Template Metaprogramming Complexity
147+
148+
- **Issue**: The `isfunc` template specialization pattern is complex and fragile. Adding new callable types requires new specializations.
149+
- **Impact**: Consider using a simpler runtime check instead.
150+
151+
---
152+
153+
## Test Coverage
154+
155+
Minimal test coverage:
156+
- Minor function tests: `isnode`, `ismap`, `islist`, `iskey`, `isempty`, `isfunc`, `getprop`, `keysof`, `haskey`, `items`, `escre`, `escurl`, `joinurl`, `stringify`, `clone`, `setprop`
157+
- Walk: `walk-basic` only
158+
- Merge: `merge-basic` only
159+
- **No tests for**: getpath, setpath, inject, transform, validate, select (functions don't exist)
160+
161+
Uses Catch2-style test framework with shared `test.json` spec.
162+
163+
---
164+
165+
## Alignment Plan
166+
167+
### Phase 1: API Redesign (Critical - Do First)
168+
1. **Redesign function signatures** to use proper C++ parameters instead of `args_container&&`
169+
- Example: `json isnode(const json& val)` instead of `json isnode(args_container&& args)`
170+
- Keep `args_container` dispatch only for the test runner
171+
2. **Remove intptr_t function pointer casting** from `walk`
172+
- Pass callback as `std::function<json(const std::string&, const json&, const json&, const std::vector<std::string>&)>`
173+
3. **Fix isfunc** to use a runtime callable check or a dedicated `JsonFunction` wrapper
174+
175+
### Phase 2: Missing Minor Functions
176+
4. Add `typify(val)` returning bitfield integers
177+
5. Add all type constants (`T_any`, `T_noval`, `T_boolean`, etc.)
178+
6. Add `typename(t)` function
179+
7. Add `strkey(key)` function
180+
8. Add `getelem(val, key, alt)` with negative index support
181+
9. Add `getdef(val, alt)` helper
182+
10. Add `delprop(parent, key)` function
183+
11. Add `size(val)` function
184+
12. Add `slice(val, start, end)` function
185+
13. Add `flatten(list, depth)` function
186+
14. Add `filter(val, check)` function
187+
15. Add `pad(str, padding, padchar)` function
188+
16. Add `replace(s, from, to)` function
189+
17. Add `join(arr, sep, url)` function
190+
18. Add `jsonify(val, flags)` function
191+
19. Add `pathify(val, startin, endin)` function
192+
20. Add `jm`/`jt` JSON builder functions
193+
21. Add `SKIP` and `DELETE` sentinel values
194+
195+
### Phase 3: Path Operations
196+
22. Implement `getpath(store, path, injdef)` with full path syntax support
197+
23. Implement `setpath(store, path, val, injdef)`
198+
199+
### Phase 4: Injection System
200+
24. Design and implement `Injection` class/struct
201+
25. Implement `inject(val, store, injdef)` with full injection system
202+
26. Implement `injectChild(child, store, inj)`
203+
27. Add `checkPlacement` and `injectorArgs` functions
204+
205+
### Phase 5: Transform
206+
28. Implement `transform(data, spec, injdef)` with all commands:
207+
- `$DELETE`, `$COPY`, `$KEY`, `$ANNO`, `$MERGE`, `$EACH`, `$PACK`
208+
- `$REF`, `$FORMAT`, `$APPLY`, `$BT`, `$DS`, `$WHEN`
209+
210+
### Phase 6: Validate
211+
29. Implement `validate(data, spec, injdef)` with all validators:
212+
- `$MAP`, `$LIST`, `$STRING`, `$NUMBER`, `$INTEGER`, `$DECIMAL`
213+
- `$BOOLEAN`, `$NULL`, `$NIL`, `$FUNCTION`, `$INSTANCE`, `$ANY`
214+
- `$CHILD`, `$ONE`, `$EXACT`
215+
216+
### Phase 7: Select
217+
30. Implement `select(children, query)` with operators:
218+
- `$AND`, `$OR`, `$NOT`, `$GT`, `$LT`, `$GTE`, `$LTE`, `$LIKE`
219+
220+
### Phase 8: Walk Enhancement
221+
31. Add `before`/`after` callback support to `walk`
222+
32. Add `maxdepth` parameter
223+
224+
### Phase 9: Test Coverage
225+
33. Add tests for all new functions using shared `test.json`
226+
34. Add all test categories matching TS suite
227+
35. Ensure memory safety (run with AddressSanitizer)
228+
36. Ensure no undefined behavior (run with UBSan)
229+
230+
### Phase 10: Code Quality
231+
37. Add proper error handling (not exceptions for expected cases)
232+
38. Review memory management in Utility/Provider classes
233+
39. Add const-correctness throughout
234+
40. Consider using `std::optional` for absent values

cpp/src/utility_decls.hpp

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -5,6 +5,7 @@
55

66
#include <sstream>
77
#include <iomanip>
8+
#include <cmath>
89

910
#include <regex>
1011

cpp/src/voxgig_struct.hpp

Lines changed: 91 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,8 +7,99 @@ namespace VoxgigStruct {
77

88
namespace S {
99
const std::string empty = "";
10+
const std::string any = "any";
11+
const std::string nil = "nil";
12+
const std::string boolean_s = "boolean";
13+
const std::string decimal = "decimal";
14+
const std::string integer = "integer";
15+
const std::string number = "number";
16+
const std::string string_s = "string";
17+
const std::string function_s = "function";
18+
const std::string symbol = "symbol";
19+
const std::string null_s = "null";
20+
const std::string list = "list";
21+
const std::string map = "map";
22+
const std::string instance = "instance";
23+
const std::string scalar = "scalar";
24+
const std::string node = "node";
25+
const std::string viz = ": ";
1026
};
1127

28+
// Type constants - bitfield integers matching TypeScript canonical.
29+
constexpr int T_any = (1 << 31) - 1;
30+
constexpr int T_noval = 1 << 30;
31+
constexpr int T_boolean = 1 << 29;
32+
constexpr int T_decimal = 1 << 28;
33+
constexpr int T_integer = 1 << 27;
34+
constexpr int T_number = 1 << 26;
35+
constexpr int T_string = 1 << 25;
36+
constexpr int T_function = 1 << 24;
37+
constexpr int T_symbol = 1 << 23;
38+
constexpr int T_null = 1 << 22;
39+
constexpr int T_list = 1 << 14;
40+
constexpr int T_map = 1 << 13;
41+
constexpr int T_instance = 1 << 12;
42+
constexpr int T_scalar = 1 << 7;
43+
constexpr int T_node = 1 << 6;
44+
45+
const std::string TYPENAME[] = {
46+
S::any, S::nil, S::boolean_s, S::decimal, S::integer, S::number, S::string_s,
47+
S::function_s, S::symbol, S::null_s,
48+
"", "", "", "", "", "", "",
49+
S::list, S::map, S::instance,
50+
"", "", "", "",
51+
S::scalar, S::node,
52+
};
53+
constexpr int TYPENAME_LEN = 26;
54+
55+
// Get type name string from type bitfield value.
56+
inline std::string typename_of(int t) {
57+
std::string tname = "";
58+
for (int tI = 0; tI < TYPENAME_LEN; tI++) {
59+
if (!TYPENAME[tI].empty() && 0 < (t & (1 << (31 - tI)))) {
60+
tname = TYPENAME[tI];
61+
}
62+
}
63+
return tname;
64+
}
65+
66+
// Determine the type of a value as a bitfield integer.
67+
inline int typify(const json& value) {
68+
if (value.is_null()) {
69+
return T_noval;
70+
}
71+
72+
if (value.is_boolean()) {
73+
return T_scalar | T_boolean;
74+
}
75+
76+
if (value.is_number_integer()) {
77+
return T_scalar | T_number | T_integer;
78+
}
79+
80+
if (value.is_number_float()) {
81+
double d = value.get<double>();
82+
if (std::isnan(d)) {
83+
return T_noval;
84+
}
85+
return T_scalar | T_number | T_decimal;
86+
}
87+
88+
if (value.is_string()) {
89+
return T_scalar | T_string;
90+
}
91+
92+
if (value.is_array()) {
93+
return T_node | T_list;
94+
}
95+
96+
if (value.is_object()) {
97+
return T_node | T_map;
98+
}
99+
100+
return T_any;
101+
}
102+
12103
inline json isnode(args_container&& args) {
13104
json val = args.size() == 0 ? nullptr : std::move(args[0]);
14105

go/NOTES.md

Lines changed: 19 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
1+
# Go Implementation Notes
2+
3+
## undefined vs null
4+
5+
Go has only `nil` — there is no native distinction between "absent" and "null".
6+
For this library:
7+
- `nil` is used to represent **property absence** (the TypeScript `undefined` equivalent).
8+
- TypeScript tests relating to `undefined` should be treated as **property absence**: the key
9+
does not exist in the map, or the function parameter was not provided.
10+
- JSON null is ambiguous with `nil`. Where the distinction matters, the test runner uses
11+
marker strings: `NULLMARK = "__NULL__"` for JSON null and `UNDEFMARK = "__UNDEF__"` for absent values.
12+
- In practice, most APIs do not use JSON null, so this ambiguity rarely causes issues.
13+
14+
## Type System
15+
16+
This implementation uses bitfield integers for the type system, matching the TypeScript canonical.
17+
Type constants (`T_any`, `T_noval`, `T_boolean`, etc.) are exported and `Typify()` returns
18+
integer bitfields. Use `Typename()` to get the human-readable name for error messages.
19+
Bitwise operations allow composite type checks (e.g., `T_scalar | T_string`).

0 commit comments

Comments
 (0)