feat: Gemma 3 Support #440

Closed · 5 tasks
yhg8423 opened this issue Mar 13, 2025 · 10 comments · Fixed by #444
Assignees: giladgd
Labels: new feature (New feature or request), released, roadmap (Part of the roadmap for node-llama-cpp: https://github.com/orgs/withcatai/projects/1)

Comments

yhg8423 (Contributor) commented Mar 13, 2025

Feature Description

Google has just released its Gemma 3 models. Supporting them in node-llama-cpp would enhance the library's utility and expand compatibility for users seeking to integrate the latest models into their applications.

The Solution

node-llama-cpp currently cannot load the Gemma 3 model architecture. Support for the Gemma 3 GGUF metadata should be added.

Considered Alternatives

This only requires adding support for a new model to the library, so I don't think there is any alternative approach.

Additional Context

No response

Related Features to This Feature Request

  • Metal support
  • CUDA support
  • Vulkan support
  • Grammar
  • Function calling

Are you willing to resolve this issue by submitting a Pull Request?

Yes, I have the time, and I know how to start.

yhg8423 added the new feature (New feature or request) and requires triage (Requires triaging) labels Mar 13, 2025
giladgd (Contributor) commented Mar 13, 2025

I'll release a new version next week that will come with the latest llama.cpp release, which supports Gemma 3 models.

You can always build the latest llama.cpp release in node-llama-cpp to use the latest features, even before a new version of node-llama-cpp is released.

There's currently an issue with the latest release of llama.cpp, so you can use release b4880 for now, which includes support for Gemma 3, but doesn't have the issue from the latest release.
Run this command inside your project directory to download this release and build it:

npx --no node-llama-cpp source download --release b4880
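Once the build finishes, loading a Gemma 3 GGUF should work through the regular API. As a minimal sketch, based on node-llama-cpp's v3 API and with a hypothetical local model path:

```ts
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
    // hypothetical filename; any Gemma 3 GGUF should work once the build supports it
    modelPath: "./models/gemma-3-4b-it-Q4_K_M.gguf"
});
const context = await model.createContext();
const session = new LlamaChatSession({
    contextSequence: context.getSequence()
});

console.log(await session.prompt("Hi there! How are you?"));
```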

giladgd self-assigned this Mar 13, 2025
giladgd added the roadmap label (Part of the roadmap for node-llama-cpp: https://github.com/orgs/withcatai/projects/1) and removed the requires triage (Requires triaging) label Mar 13, 2025
giladgd moved this to In Progress in node-llama-cpp: roadmap Mar 13, 2025
yhg8423 (Contributor, Author) commented Mar 14, 2025

Oh, I see! Thanks! 👍

briancullinan2 commented

Command looked promising! Not sure where to go from here:

FAILED: CMakeFiles/llama-addon.dir/addon/AddonGrammarEvaluationState.cpp.o 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -DGGML_BACKEND_SHARED -DGGML_SHARED -DGGML_USE_BLAS -DGGML_USE_CPU -DGGML_USE_METAL -DLLAMA_SHARED -DNAPI_VERSION=7 -Dllama_addon_EXPORTS -I/Users/briancullinan/jupyter_ops/node_modules/node-addon-api -I/Users/briancullinan/jupyter_ops/node_modules/node-api-headers/include -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/gpuInfo -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/./llama.cpp/common -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/src/. -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/src/../include -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/src/../common -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/ggml/src/../include -I/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/common/. -D_DARWIN_USE_64_BIT_INODE=1 -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -DBUILDING_NODE_EXTENSION -O3 -DNDEBUG -std=gnu++17 -arch arm64 -isysroot /Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX15.1.sdk -fPIC -fexceptions -Wno-c++17-extensions -MD -MT CMakeFiles/llama-addon.dir/addon/AddonGrammarEvaluationState.cpp.o -MF CMakeFiles/llama-addon.dir/addon/AddonGrammarEvaluationState.cpp.o.d -o CMakeFiles/llama-addon.dir/addon/AddonGrammarEvaluationState.cpp.o -c /Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/addon/AddonGrammarEvaluationState.cpp
/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/addon/AddonGrammarEvaluationState.cpp:15:15: error: no matching function for call to 'llama_sampler_init_grammar'
   15 |     sampler = llama_sampler_init_grammar(model->model, grammarDef->grammarCode.c_str(), grammarDef->rootRuleName.c_str());
      |               ^~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/briancullinan/jupyter_ops/node_modules/node-llama-cpp/llama/llama.cpp/src/../include/llama.h:1203:38: note: candidate function not viable: cannot convert argument of incomplete type 'llama_model *' to 'const struct llama_vocab *' for 1st argument
 1203 |     LLAMA_API struct llama_sampler * llama_sampler_init_grammar(
      |                                      ^
 1204 |             const struct llama_vocab * vocab,
      |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
[58/72] Building CXX object llama.cpp/common/CMakeFiles/common.dir/common.cpp.o
[59/72] Building CXX object llama.cpp/common/CMakeFiles/common.dir/arg.cpp.o
[60/72] Building CXX object llama.cpp/common/CMakeFiles/common.dir/json-schema-to-grammar.cpp.o
[61/72] Building CXX object llama.cpp/common/CMakeFiles/common.dir/chat.cpp.o
ninja: build stopped: subcommand failed.
ERR! OMG Process terminated: 1
✖️ Failed to compile llama.cpp
Failed to build llama.cpp with Metal support. Error: SpawnError: Command npm run -s cmake-js-llama -- compile --log-level warn --config Release --arch=arm64 --out localBuilds/mac-arm64-metal-release-b4880 --runtime-version=22.9.0 --parallel=9 --CDGGML_METAL=1 --CDGGML_CCACHE=OFF exited with code 1
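For context on this error: newer llama.cpp releases changed llama_sampler_init_grammar() to take a const llama_vocab * as its first argument instead of a llama_model *, which is what the quoted addon line trips over; a mismatch like this typically means the compiled addon sources predate llama.cpp's vocab refactor. A minimal sketch of the adjusted call, assuming model->model is a llama_model * as in the quoted addon source:

```cpp
// llama.cpp now exposes llama_model_get_vocab() to obtain the vocab
// from a loaded model; pass that instead of the model itself.
const llama_vocab * vocab = llama_model_get_vocab(model->model);
sampler = llama_sampler_init_grammar(
    vocab,
    grammarDef->grammarCode.c_str(),
    grammarDef->rootRuleName.c_str()
);
```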

giladgd (Contributor) commented Mar 16, 2025

@briancullinan2 Do you use the latest version of node-llama-cpp? (version 3.6.0)
If you do, please run this command inside your project and attach its result so I can help you:

npx --no node-llama-cpp inspect gpu

briancullinan2 commented Mar 16, 2025

I believe I'm using the latest version. Other models work fine, but that's the first time I've tried that npx compile command.

OS: macOS 24.3.0 (arm64)
Node: 22.9.0 (arm64)
node-llama-cpp: 3.6.0

Metal: available

Metal device: Apple M1 Max
Metal used VRAM: 0% (64KB/48GB)
Metal free VRAM: 99.99% (48GB/48GB)
Metal unified memory: 48GB (100%)

CPU model: Apple M1 Max
Math cores: 8
Used RAM: 99.6% (63.75GB/64GB)
Free RAM: 0.39% (257.2MB/64GB)
Used swap: 97.6% (26.35GB/27GB)
Max swap size: dynamic
mmap: supported

Is there another version I could try, other than --release b4880, that would help narrow down whether the problem is my machine or the build process? I imagine the distributed pre-built binaries work in my case, since nothing was rebuilt on install. Maybe it's a bug in the build process on Mac?

giladgd (Contributor) commented Mar 21, 2025

@briancullinan2 Your machine seems fine, so I'm not sure what's causing the issue you encountered.
The issue with the latest llama.cpp release I mentioned in my original comment has been resolved since then, so you can now use this command instead:

npx --no node-llama-cpp source download --release latest

I'll release a new version today or tomorrow, so you can also wait for it if building still doesn't work for you, but I'd love to figure out what issue you encountered so I can fix it if needed.

briancullinan2 commented

--release latest worked

giladgd (Contributor) commented Mar 22, 2025

I have to delay the release a bit since there's currently an issue in llama.cpp that crashes the process when loading a LoRA.
Once this is fixed, the tests will pass and I'll release a new version.

For now, you can still build the latest release of llama.cpp yourself using this command, but note that using a LoRA can crash the process:

npx --no node-llama-cpp source download --release latest

aldovincenti commented Mar 23, 2025

Will the new release also include support for Phi-4-mini? Currently, loading the model fails with this error:

llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'gpt-4o'
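(This error means the model's GGUF metadata declares a pre-tokenizer type, gpt-4o, that the bundled llama.cpp build predates. As a minimal sketch, you can inspect that metadata yourself with node-llama-cpp's GGUF reader; the model path is hypothetical, and the nested metadata shape shown here is an assumption:)

```ts
import {readGgufFileInfo} from "node-llama-cpp";

// hypothetical local path; remote model URLs can also be read
const fileInfo = await readGgufFileInfo("./models/Phi-4-mini-instruct-Q4_K_M.gguf");

// the pre-tokenizer type from the error comes from the
// `tokenizer.ggml.pre` GGUF key (assumed nested shape here)
console.log(fileInfo.metadata?.tokenizer?.ggml?.pre);
```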

giladgd (Contributor) commented Mar 23, 2025

@aldovincenti Yes. Every release of node-llama-cpp includes the latest release of llama.cpp available at the time it's built, so the next release of node-llama-cpp will bring support for everything currently supported by llama.cpp.
Until then, you can use the command from my previous message to use the latest models with the current release of node-llama-cpp.

polvallverdu commented

Hey! Is there any ETA on this, or are we still waiting on a new llama.cpp release? I'd love to add Gemma 3 to BashBuddy.

briancullinan2 commented

> Hey! Is there any ETA on this, or are we still waiting on a new llama.cpp release? I'd love to add Gemma 3 to BashBuddy.

After updating with npx --no node-llama-cpp source download --release latest, Gemma 3 works. It's faster, too.

giladgd (Contributor) commented Mar 24, 2025

@polvallverdu I'll wait a few days for the LoRA issue to be resolved in llama.cpp; if it still persists, I'll release a beta version with it documented as a known issue, and release a stable version later once the issue is resolved.
I would have opened a PR to fix that issue myself, but I don't have enough time to continue debugging it at the moment.

github-project-automation bot moved this from In Progress to Done in node-llama-cpp: roadmap Mar 27, 2025

🎉 This issue has been resolved in version 3.7.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
