Nvidia Blackwell Support #7
2 issues found across 7 files
Prompt for AI agents (all 2 issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="BENCHMARKS.md">
<violation number="1" location="BENCHMARKS.md:50">
P2: Incorrect import pattern in code example. The L0 API exports `run` directly from the module, so `from l0 import l0` will fail. Use `import l0` instead, and replace `json_rule()` with an actual exported guardrail like `l0.JSON_ONLY_GUARDRAILS` or `l0.Guardrails.recommended()`.</violation>
</file>
<file name="tests/test_benchmark.py">
<violation number="1" location="tests/test_benchmark.py:291">
P2: Inconsistent TTFT measurement: `start_time` is set after `_internal_run` completes, but in `run_baseline_benchmark` it's set before iteration. This makes time-to-first-token comparisons between baseline and L0 benchmarks invalid. Consider moving `start_time` before the `_internal_run` call to be consistent.</violation>
</file>
P2: Incorrect import pattern in code example. The L0 API exports `run` directly from the module, so `from l0 import l0` will fail. Use `import l0` instead, and replace `json_rule()` with an actual exported guardrail like `l0.JSON_ONLY_GUARDRAILS` or `l0.Guardrails.recommended()`.
<file context>
@@ -0,0 +1,97 @@
+
+Configure via `check_intervals`:
+```python
+from l0 import l0
+
+result = await l0.run(
</file context>
✅ Addressed in 768c1a8
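The root cause named in this finding can be reproduced without L0 installed. The sketch below is only an illustration: `fakepkg` and its `run` coroutine are hypothetical stand-ins, not the real L0 module or its signature. It shows why `from pkg import pkg` raises `ImportError` when the module exports its functions at the top level, and why `import pkg` works.

```python
import sys
import types

# Hypothetical stand-in module named "fakepkg". It mimics a package that
# exports run() at the top level and has no attribute named "fakepkg".
fakepkg_module = types.ModuleType("fakepkg")


async def run(prompt: str) -> str:
    """Placeholder coroutine; this is not the real L0 signature."""
    return prompt


fakepkg_module.run = run
sys.modules["fakepkg"] = fakepkg_module

# Pattern flagged in the review: importing a name the module does not export.
try:
    from fakepkg import fakepkg  # noqa: F401
    self_import_failed = False
except ImportError:
    self_import_failed = True

# Suggested pattern: import the module, then use its top-level function.
import fakepkg

has_run = callable(fakepkg.run)
print(self_import_failed, has_run)
```

Whether the real package exposes `JSON_ONLY_GUARDRAILS` or `Guardrails.recommended()` is taken from the review comment and is not checked here.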
P2: Inconsistent TTFT measurement: `start_time` is set after `_internal_run` completes, but in `run_baseline_benchmark` it's set before iteration. This makes time-to-first-token comparisons between baseline and L0 benchmarks invalid. Consider moving `start_time` before the `_internal_run` call to be consistent.
<file context>
@@ -0,0 +1,1044 @@
+ check_intervals=check_intervals,
+ )
+
+ start_time = time.perf_counter()
+
+ async for event in result:
</file context>
✅ Addressed in 9d74796
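The effect of the timer placement can be shown with a small self-contained sketch. Everything here is an assumption made for illustration: `fake_run` is a hypothetical stand-in for an awaited `_internal_run`-style call that does setup work before returning a stream, and `measure_ttft` is not the benchmark's actual helper. When the clock starts after the call returns, the setup cost is left out of the time to first token.

```python
import asyncio
import time


async def fake_run(setup_delay: float):
    """Hypothetical wrapped call: pays a setup cost, then returns a token stream."""
    await asyncio.sleep(setup_delay)  # setup work done before streaming starts

    async def stream():
        for token in ("a", "b", "c"):
            await asyncio.sleep(0.001)  # simulated per-token latency
            yield token

    return stream()


async def measure_ttft(setup_delay: float, start_before_call: bool) -> float:
    """Return seconds until the first token, with the timer placed as requested."""
    if start_before_call:
        start_time = time.perf_counter()  # includes setup cost
    result = await fake_run(setup_delay)
    if not start_before_call:
        start_time = time.perf_counter()  # excludes setup cost (the flagged pattern)

    async for _ in result:
        return time.perf_counter() - start_time
    raise RuntimeError("stream produced no tokens")


async def main() -> tuple[float, float]:
    consistent = await measure_ttft(0.05, start_before_call=True)
    skewed = await measure_ttft(0.05, start_before_call=False)
    return consistent, skewed


consistent, skewed = asyncio.run(main())
print(f"timer before call: {consistent * 1000:.1f} ms")
print(f"timer after call:  {skewed * 1000:.1f} ms")
```

In this sketch the second measurement comes out smaller by roughly the setup delay, which is the kind of bias the finding describes when the two benchmarks place the timer differently.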
Summary by cubic
Optimized streaming guardrails and drift detection for Nvidia Blackwell–class throughput and added a full benchmark suite and docs. L0 now sustains 90K+ tokens/s with all features enabled, leaving ample headroom for models that generate 1,000+ tokens/s.
Performance
Benchmarks
Written for commit dcb1a47. Summary will update automatically on new commits.