Yes, as long as you have enough GPU memory and RAM. Each claw is effectively another agent + model runtime process. On a 128‑GB Spark, two medium‑sized Nemotron instances are workable; beyond that you’ll trade off context length and KV‑cache size. For heavier multi‑agent workloads, consider multiple Sparks or off‑loading some agents to other GPUs.
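To make the trade-off concrete, here is a rough back-of-the-envelope sketch of the memory budget. It assumes a plain attention KV cache (2 tensors per attention layer, FP16 values); the layer counts, head sizes, weight footprint, and reserved headroom below are hypothetical placeholders, not measured Nemotron figures, so substitute the numbers for the model and quantization you actually run.

```python
def kv_cache_gib(attn_layers, kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Approximate KV-cache size in GiB for one sequence at full context.

    The leading factor of 2 accounts for storing both keys and values.
    bytes_per_value=2 assumes FP16/BF16 cache entries.
    """
    total_bytes = 2 * attn_layers * kv_heads * head_dim * context_tokens * bytes_per_value
    return total_bytes / 2**30


def instances_that_fit(total_gib, reserved_gib, weights_gib, kv_gib):
    """How many agent + model instances fit once OS/runtime headroom is reserved."""
    per_instance = weights_gib + kv_gib
    return int((total_gib - reserved_gib) // per_instance)


# Hypothetical "medium-sized" model: 32 attention layers, 8 KV heads, head_dim 128,
# roughly 40 GiB of weights, with 16 GiB held back for the OS and agent processes.
short_ctx = kv_cache_gib(32, 8, 128, 32_768)    # 32K-token context
long_ctx = kv_cache_gib(32, 8, 128, 262_144)    # 256K-token context

print(short_ctx, instances_that_fit(128, 16, 40, short_ctx))
print(long_ctx, instances_that_fit(128, 16, 40, long_ctx))
```

With these placeholder numbers, two instances fit at a 32K context but only one at 256K, which is the context-length versus instance-count trade-off described above. Real usage will differ with batching, quantized caches, or hybrid architectures that use fewer attention layers.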

zNeill
Apr 2, 2026
Collaborator Author
Answer selected by zNeill
Category
Q&A