NVIDIA Jetson Orin Nano #660
BradHutchings
started this conversation in General
The 100 GB/s memory bandwidth is nice, and there is probably CUDA acceleration as well; I don't know whether llama.cpp supports the GPU on it yet, though.
I have a Jetson Nano and have run the latest builds of llamafile on it. For Llama 1B you get approx. 50 tok/s for generation and 1500 tok/s for prompt processing.
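To put those figures in perspective, here is a back-of-envelope latency estimate using the 50 tok/s generation and 1500 tok/s prompt-processing rates quoted above (the request sizes are illustrative):

```python
# Back-of-envelope latency estimate from the figures quoted above.
PROMPT_TOK_S = 1500  # prompt-processing speed (tok/s, from the reply)
GEN_TOK_S = 50       # generation speed (tok/s, from the reply)

def estimated_latency(prompt_tokens: int, output_tokens: int) -> float:
    """Seconds to ingest a prompt and generate a reply at the quoted rates."""
    return prompt_tokens / PROMPT_TOK_S + output_tokens / GEN_TOK_S

# e.g. a 512-token prompt with a 128-token answer:
print(round(estimated_latency(512, 128), 2))  # prints 2.9
```

So at these rates, interactive chat with a 1B model would feel responsive: prompt ingestion is nearly free, and generation dominates the total time.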
Should a llamafile binary work on this new toy?
https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/jetson-orin/nano-super-developer-kit/
Six ARM cores, Ubuntu Linux, 8 GB RAM, a ton of CUDA cores.
They show benchmarks for a few 8B and 9B models running faster than whatever their baseline is.