From the Mar 31 NemoClaw Livestream — Models, runtimes, and Nemotron‑specific behavior
Answered by zNeill, Apr 2, 2026
Use the latest configs in the Nemotron on Spark repo, which set the model, KV‑cache, and runtime flags for long contexts. Ensure you allocate enough GPU memory to the KV cache, and be mindful that running multiple large agents on one box will limit per‑agent context. When in doubt, start with a single Nemotron 3 Super instance, verify context behavior, then scale out.
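To see why co-locating agents shrinks per‑agent context, it helps to estimate KV‑cache memory per token and divide the remaining GPU budget across agents. The sketch below uses the standard KV‑cache sizing formula (2 tensors per layer × KV heads × head dim × dtype bytes); the model dimensions and memory budget are illustrative assumptions, not the actual Nemotron 3 Super specs.

```python
def kv_cache_bytes_per_token(num_layers: int, num_kv_heads: int,
                             head_dim: int, dtype_bytes: int = 2) -> int:
    # Each layer stores a K and a V tensor per token: 2 * kv_heads * head_dim values
    return 2 * num_layers * num_kv_heads * head_dim * dtype_bytes

def max_context_per_agent(free_gpu_bytes: int, num_agents: int,
                          per_token_bytes: int) -> int:
    # Splitting one GPU's free memory across agents divides each agent's context budget
    return free_gpu_bytes // num_agents // per_token_bytes

# Illustrative numbers only -- NOT real Nemotron 3 Super dimensions
per_token = kv_cache_bytes_per_token(num_layers=48, num_kv_heads=8,
                                     head_dim=128, dtype_bytes=2)  # fp16 cache
budget = 60 * 1024**3  # assume ~60 GiB left for KV cache after weights

print(max_context_per_agent(budget, num_agents=1, per_token_bytes=per_token))  # 327680
print(max_context_per_agent(budget, num_agents=4, per_token_bytes=per_token))  # 81920
```

With these made-up dimensions, four agents on one box cut each agent's usable context to a quarter of the single-instance figure, which is why the answer recommends verifying context behavior with one instance first.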