@victorgveloso victorgveloso commented Oct 14, 2025

Summary

This PR enables java-tree-sitter to load a bundled native tree-sitter binary shipped inside the project JAR. Instead of relying solely on System.loadLibrary(...) / the OS java.library.path, we extract the appropriate native artifact (per OS + arch) from the JAR to a temporary file at runtime and call System.load(). After the library is loaded with System.load, we use SymbolLookup.loaderLookup() to resolve native symbols via the Foreign Function & Memory API (FFM).

The change simplifies usage for downstream consumers (no manual make install or java.library.path fiddling on common platforms), while preserving the previous system-level lookup fallback.

Motivation

As discussed in the issue thread, running tests or sample code on platforms such as macOS (M1/M2) fails when libtree-sitter is not installed in the system library path. Users currently must:

  • install tree-sitter to /usr/local/lib (or another entry in java.library.path), or
  • copy .dylib/.so/.dll files into OS-specific extension directories.

This PR addresses that friction by allowing the library to be bundled in the JAR and loaded automatically at runtime. It also aligns with the design of SymbolLookup.loaderLookup() which is intended to see libraries previously loaded via System.load / System.loadLibrary.

What this PR changes

  • Adds findLibraryBundledInJar() implementation in ChainedLibraryLookup:
    • Detects the OS (Linux, macOS, Windows) and architecture (x86_64, aarch64).
    • Builds a list of candidate resource paths inside the JAR (e.g. /natives/{os}-{arch}/libtree-sitter.so, /native/{arch}/tree-sitter.dll, etc.).
    • Extracts the matching resource to a uniquely-named temporary file.
    • Calls System.load(tempFilePath).
    • Uses deleteOnExit() as a best-effort cleanup.
  • Keeps the existing fallback logic:
    • First try SymbolLookup.libraryLookup(...) (i.e. library on system paths),
    • then try the bundled extraction + SymbolLookup.loaderLookup(),
    • then try System.loadLibrary("tree-sitter") as a final fallback.
  • Leaves the previously introduced lazy lookup fix in place (findLibrary(arena) is called only when needed); this prevents an eager UnsatisfiedLinkError during static initialization when other SymbolLookups succeed.
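
The detection and extraction steps listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class name BundledNativeLoader, its method names, and the exact resource layout are assumptions made for the example (the two candidate paths mirror the patterns mentioned above). It deliberately stops at System.load(); resolving symbols afterwards via SymbolLookup.loaderLookup() is left out.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: class name, method names, and resource layout are illustrative only.
final class BundledNativeLoader {
    private BundledNativeLoader() {}

    // Maps the os.name system property to "linux", "macos", or "windows".
    static String normalizeOs(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        // Check macOS first: "darwin" also contains the substring "win".
        if (os.contains("mac") || os.contains("darwin")) return "macos";
        if (os.contains("win")) return "windows";
        if (os.contains("linux")) return "linux";
        throw new UnsupportedOperationException("Unsupported OS: " + osName);
    }

    // Maps the os.arch system property to "x86_64" or "aarch64".
    static String normalizeArch(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) return "x86_64";
        if (arch.equals("aarch64") || arch.equals("arm64")) return "aarch64";
        throw new UnsupportedOperationException("Unsupported architecture: " + osArch);
    }

    // Platform-specific file name of the tree-sitter shared library.
    static String libraryFileName(String os) {
        if (os.equals("windows")) return "tree-sitter.dll";
        if (os.equals("macos")) return "libtree-sitter.dylib";
        return "libtree-sitter.so";
    }

    // Candidate resource paths inside the JAR, most specific first (assumed layout).
    static List<String> candidateResources(String os, String arch) {
        String file = libraryFileName(os);
        return List.of(
            "/natives/" + os + "-" + arch + "/" + file,
            "/native/" + arch + "/" + file);
    }

    // Copies the stream to a uniquely named temporary file; deletion on JVM exit is best effort.
    static Path extract(InputStream in, String fileName) throws IOException {
        Path tmp = Files.createTempFile("tree-sitter-", "-" + fileName);
        tmp.toFile().deleteOnExit();
        Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
        return tmp;
    }

    // Loads the first bundled library found; returns false so the caller can fall back.
    static boolean loadBundled(Class<?> anchor) throws IOException {
        String os = normalizeOs(System.getProperty("os.name"));
        String arch = normalizeArch(System.getProperty("os.arch"));
        for (String resource : candidateResources(os, arch)) {
            try (InputStream in = anchor.getResourceAsStream(resource)) {
                if (in == null) continue; // nothing bundled at this path
                Path tmp = extract(in, libraryFileName(os));
                System.load(tmp.toAbsolutePath().toString()); // requires an absolute path
                return true;
            }
        }
        return false;
    }
}
```

One design point worth keeping in mind: SymbolLookup.loaderLookup() only sees libraries associated with the caller's class loader, so in a sketch like this the System.load() call and the later lookup should run in classes loaded by the same class loader.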

Issues fixed

Close #80
Close #81

@victorgveloso
Author

victorgveloso commented Oct 14, 2025

In this last commit I added the native libraries from the latest Tree-sitter release (v0.25.10) for three OSs (Linux, macOS, and Windows) and two architectures (x86_64 and aarch64).

I tested most of them on my MacBook Air M2 and my dual-boot (Windows/Linux) AMD64 machine. The only one left untested is Linux aarch64, but I have a Raspberry Pi here that I can use to test it. Let me know if I should proceed with testing it.

To replicate the test, follow the instructions below:

$ brew install curl maven git tar # On MacOS
$ sudo apt-get install curl maven git tar # On Debian-based Linux distros
$ # TODO add instructions for Windows here
$ git clone --recursive https://github.com/tree-sitter/java-tree-sitter/
$ git clone --recursive https://github.com/tree-sitter/tree-sitter-java/
$ cd tree-sitter-java
$ make
$ sudo make install
$ cd ../java-tree-sitter/core
$ make
$ sudo make install
$ cd ../.. && curl https://download.java.net/java/early_access/jextract/22/6/openjdk-22-jextract+6-47_linux-x64_bin.tar.gz | tar -xzv # Linux x64 build of jextract
$ cd java-tree-sitter
$ PATH="../jextract-22/bin:${PATH}" mvn test

@ObserverOfTime
Member

You're free to do this in a library that consumes jtreesitter but I don't want it in jtreesitter itself.

@victorgveloso
Author

OK, at least we can create a library that consumes jtreesitter and simplify its build thanks to the extensibility introduced by the ServiceLoader. However, if you're not accepting #153 because it's a feature you don't want in jtreesitter, please consider accepting #152, which is a required bugfix that enables such extensions to work when the native libraries are located outside java.library.path.

Development

Successfully merging this pull request may close these issues.

  • Should usage of SymbolLookup.loaderLookup() be opt-in?
  • Unresolved symbol: ts_set_allocator (macOS)