[Java.Interop] JNI handles are now in a "control block" #1334

Draft
wants to merge 8 commits into
base: main

jonpryor
Contributor

Context: dotnet/runtime#114184
Context: dotnet/android#10125
Context: dotnet/android#10125 (comment)

Part of adding a GC bridge to CoreCLR are the new APIs:

namespace System.Runtime.InteropServices.Java;
public struct ComponentCrossReference {
    public nint SourceGroupIndex, DestinationGroupIndex;
}
public unsafe struct StronglyConnectedComponent {
    public nint Count;
    public IntPtr* Context;
}
public static partial class JavaMarshal {
    public static unsafe void Initialize(
        delegate* unmanaged<
            System.IntPtr,                  // sccsLen
            StronglyConnectedComponent*,    // sccs
            System.IntPtr,                  // ccrsLen
            ComponentCrossReference*,       // ccrs
            void> markCrossReferences);
    public static GCHandle CreateReferenceTrackingHandle(object obj, IntPtr context);
    public static IntPtr GetContext(GCHandle obj);
}

Of note is the "data flow" of context:

  • JavaMarshal.CreateReferenceTrackingHandle() has a "context" parameter.

  • The context parameter to JavaMarshal.CreateReferenceTrackingHandle() is the value that JavaMarshal.GetContext() later returns for the resulting GCHandle.

  • The context parameter to JavaMarshal.CreateReferenceTrackingHandle() is stored within StronglyConnectedComponent.Context.

  • The markCrossReferences parameter of JavaMarshal.Initialize() is called by the GC bridge and given a native array of StronglyConnectedComponent instances, each of which carries a Context field.

The short of it is that the proposed GC bridge doesn't contain direct access to the IJavaPeerable instances in play. Instead, it has access to "context" which must contain the JNI Object Reference information that the markCrossReferences callback needs access to.
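To make that data flow concrete, here is a rough sketch of what a `markCrossReferences` callback could look like. The two structs are copied from the proposed API above so the snippet stands alone; `BridgeCallbackSketch`, `CountContexts()`, `GetCallback()`, and `Demo()` are hypothetical names for illustration only. It also assumes that `Context` points at `Count` context values, which the proposal implies but does not spell out.

```csharp
using System;
using System.Runtime.InteropServices;

// Local copies of the proposed structs, so this compiles without .NET 10.
public struct ComponentCrossReference {
    public nint SourceGroupIndex, DestinationGroupIndex;
}
public unsafe struct StronglyConnectedComponent {
    public nint Count;
    public IntPtr* Context;   // ASSUMPTION: array of `Count` context pointers
}

public static unsafe class BridgeCallbackSketch {
    // Hypothetical helper: walks every SCC and counts non-null contexts.
    // A real callback would read JNI reference info out of each context.
    public static nint CountContexts(nint sccsLen, StronglyConnectedComponent* sccs)
    {
        nint total = 0;
        for (nint i = 0; i < sccsLen; i++) {
            for (nint j = 0; j < sccs[i].Count; j++) {
                IntPtr context = sccs[i].Context[j];
                if (context != IntPtr.Zero)
                    total++;
            }
        }
        return total;
    }

    // Shape matches the `delegate* unmanaged<...>` parameter in the proposal.
    [UnmanagedCallersOnly]
    static void MarkCrossReferences(
            IntPtr sccsLen, StronglyConnectedComponent* sccs,
            IntPtr ccrsLen, ComponentCrossReference* ccrs)
    {
        // Only contexts are visible here; no IJavaPeerable instances.
        CountContexts(sccsLen, sccs);
    }

    // The value that would be handed to JavaMarshal.Initialize().
    public static delegate* unmanaged<IntPtr, StronglyConnectedComponent*,
            IntPtr, ComponentCrossReference*, void> GetCallback()
        => &MarkCrossReferences;

    // Exercises the walking logic with fake data; no GC bridge involved.
    public static nint Demo()
    {
        IntPtr* a = stackalloc IntPtr[2] { (IntPtr) 0x10, (IntPtr) 0x20 };
        IntPtr* b = stackalloc IntPtr[1] { (IntPtr) 0x30 };
        StronglyConnectedComponent* sccs = stackalloc StronglyConnectedComponent[2];
        sccs[0] = new StronglyConnectedComponent { Count = 2, Context = a };
        sccs[1] = new StronglyConnectedComponent { Count = 1, Context = b };
        return CountContexts(2, sccs);
    }
}
```

The point of the sketch is only that everything the callback can reach goes through the context pointers, which is why the context has to carry the JNI object reference information itself.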

Furthermore, the context pointer value cannot change, i.e. it needs to be a native pointer value à la malloc(3), not a value which can be moved by the GC. (The contents can change; the pointer value cannot.)

While we're still prototyping this, what we currently believe we need is the JNI object reference, JNI object reference type, and (maybe?) the JNI Weak Global Reference value and "refs added" values.
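As a minimal sketch of what such a control block might look like: the field names, their types, and the `ControlBlockSketch`/`ControlBlockAllocator` helpers below are all assumptions for illustration, not the layout this PR actually uses. The block lives in unmanaged memory (here via `Marshal.AllocHGlobal()`), so its address stays fixed while its contents are updated.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical layout; the real control block may differ.
[StructLayout(LayoutKind.Sequential)]
public struct ControlBlockSketch {
    public IntPtr Handle;       // JNI object reference
    public int    HandleType;   // JNI object reference type
    public IntPtr WeakHandle;   // (maybe) JNI Weak Global Reference
    public int    RefsAdded;    // (maybe) "refs added" counter
}

public static class ControlBlockAllocator {
    // Allocates native memory; the returned pointer is never moved by the GC.
    public static IntPtr Alloc(IntPtr handle, int handleType)
    {
        IntPtr block = Marshal.AllocHGlobal(Marshal.SizeOf<ControlBlockSketch>());
        var value = new ControlBlockSketch { Handle = handle, HandleType = handleType };
        Marshal.StructureToPtr(value, block, fDeleteOld: false);
        return block;
    }

    // Contents may change; the pointer value does not.
    public static void SetHandle(IntPtr block, IntPtr handle, int handleType)
    {
        var value = Marshal.PtrToStructure<ControlBlockSketch>(block);
        value.Handle = handle;
        value.HandleType = handleType;
        Marshal.StructureToPtr(value, block, fDeleteOld: false);
    }

    public static ControlBlockSketch Read(IntPtr block)
        => Marshal.PtrToStructure<ControlBlockSketch>(block);

    public static void Free(IntPtr block) => Marshal.FreeHGlobal(block);
}
```

A pointer obtained from `Alloc()` is the kind of value that could be passed as `context`, since swapping the stored handle via `SetHandle()` leaves the pointer itself untouched.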

Update IJavaPeerable to add a JniObjectReferenceControlBlock property which can be used as the context parameter:

partial interface IJavaPeerable {
    IntPtr JniObjectReferenceControlBlock => 0;
}

This supports usage of:

IJavaPeerable   value   = …
GCHandle        handle  = JavaMarshal.CreateReferenceTrackingHandle(
    value,
    value.JniObjectReferenceControlBlock
);

@jonpryor
Contributor Author

jonpryor commented May 10, 2025

While hand-waving… TODO:

  • Update JavaException as JavaObject was updated.
  • Update src/java-interop so that the MonoVM GC bridge works with this new "control block" strategy.
  • Make sure that src/java-interop works; see also a5b229d, 93c3872
  • Update the build system so that .NET 10 can be used to build things.
  • Update ManagedValueManager (?) to begin calling JavaMarshal APIs when building against .NET 10.
  • Does It Work™?

jonpryor added 3 commits May 12, 2025 07:28
It works!

Build:

	dotnet build -c Release -p:DotNetTargetFramework=net8.0 -t:Prepare
	dotnet build -c Release -p:DotNetTargetFramework=net8.0
	dotnet publish --self-contained -p:UseMonoRuntime=true -p:DotNetTargetFramework=net8.0 \
	  -p:UseAppHost=true -p:ErrorOnDuplicatePublishOutputFiles=false \
	  samples/Hello-Java.Base/Hello-Java.Base.csproj -r osx-x64

Run:

	JAVA_INTEROP_GREF_LOG=g.txt  ./samples/Hello-Java.Base/bin/Release/osx-x64/publish/Hello-Java.Base

`g.txt` contains:

	…
	*take_weak obj=0x10560dc70; handle=0x7f7f42f15148
	+w+ grefc 8 gwrefc 1 obj-handle 0x7f7f42f15148/G -> new-handle 0x7f7f43008401/W from thread '(null)'(1)
	take_weak_global_ref_jni
	-g- grefc 7 gwrefc 1 handle 0x7f7f42f15148/G from thread '(null)'(1)
	take_weak_global_ref_jni
	*try_take_global obj=0x10560dc70 -> wref=0x7f7f43008401 handle=0x0
	-w- grefc 7 gwrefc 0 handle 0x7f7f43008401/W from thread '(null)'(1)
	take_global_ref_jni
	Finalizing PeerReference=0x0/G IdentityHashCode=0x70dea4e Instance=0x59b98e2b Instance.Type=Hello.MyJLO
A "funny" thing happened after 628aa39: a teardown assertion fired!

	% dotnet test bin/TestRelease-net8.0/Java.Interop-Tests.dll
	…
	# jonp: LoadJvmLibrary(/Users/runner/hostedtoolcache/Java_Temurin-Hotspot_jdk/17.0.15-6/x64/Contents/Home/lib/libjli.dylib)=140707427245328
	# jonp: JNI_CreateJavaVM=7113469040; JNI_GetCreatedJavaVMs=7113469120
	# jonp: executing JNI_CreateJavaVM=1a7feec70
	# jonp: r=0 javavm=1a9dc1830 jnienv=7fe37d0422b0
	TearDown failed for test fixture Java.InteropTests.JavaObjectArray_object_ContractTest
	  JNI global references: grefStartCount=324; gref=340
	  Expected: True
	  But was:  False

	TearDown : NUnit.Framework.AssertionException :   JNI global references: grefStartCount=324; gref=340
	  Expected: True
	  But was:  False

	StackTrace:    at Java.InteropTests.JavaObjectArray_object_ContractTest.EndCheckGlobalRefCount() in /Users/runner/work/1/s/tests/Java.Interop-Tests/Java.Interop/JavaObjectArrayTest.cs:line 158

	--TearDown
	   at NUnit.Framework.Assert.ReportFailure(String message)
	   at NUnit.Framework.Assert.ReportFailure(ConstraintResult result, String message, Object[] args)
	   at NUnit.Framework.Assert.That[TActual](TActual actual, IResolveConstraint expression, String message, Object[] args)
	   at NUnit.Framework.Assert.IsTrue(Boolean condition, String message, Object[] args)
	   at Java.InteropTests.JavaObjectArray_object_ContractTest.EndCheckGlobalRefCount() in /Users/runner/work/1/s/tests/Java.Interop-Tests/Java.Interop/JavaObjectArrayTest.cs:line 158
	   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
	   at System.Reflection.MethodBaseInvoker.InvokeWithNoArgs(Object obj, BindingFlags invokeAttr)
	  Skipped References_CreatedReference_GlobalRef [3 ms]
	  Skipped References_CreatedReference_LocalRef [< 1 ms]
	  Skipped DoesTheJmethodNeedToMatchDeclaringType [5 ms]
	# jonp: LoadJvmLibrary(/Users/runner/hostedtoolcache/Java_Temurin-Hotspot_jdk/17.0.15-6/x64/Contents/Home/lib/libjli.dylib)=140707427245328
	# jonp: JNI_CreateJavaVM=7113469040; JNI_GetCreatedJavaVMs=7113469120
	# jonp: executing JNI_CreateJavaVM=1a7feec70
	# jonp: r=-5 javavm=0 jnienv=0

Unfortunately, teardown assertions "don't count", i.e. CI was green,
even though the above is copied from the CI run.

The cause of the problem?  Having `JavaObject.Dispose()`
call `JniObjectReferenceControlBlock.Free()`.

The underlying problem is that
`JniRuntime.JniValueManager.DisposePeer()` uses `JavaObject.PeerReference`
*after* calling `JavaObject.Disposed()` (which calls
`JavaObject.Dispose(bool)`), yet `JavaObject.Dispose(bool)` released
the `JniObjectReferenceControlBlock` that backs `PeerReference`,
meaning we "lost" (leaked!) the GREF!

There are two possible solutions:

 1. Update `JniRuntime.JniValueManager.DisposePeer()` to use
    `JavaObject.PeerReference` *before* calling `JavaObject.Disposed()`

 2. Update `JavaObject.Dispose(bool)` to not release the
    `JniObjectReferenceControlBlock` that backs `PeerReference`.

I avoided (1) because I didn't want to have to audit and update
dotnet/android and all other potential callsites.

Which brings us to (2): if not in `Dispose(bool)`, then where?

The last thing that `JniRuntime.JniValueManager.DisposePeer()` and
`JniRuntime.JniValueManager.FinalizePeer()` do is call
`JavaObject.SetPeerReference(default)`.  *This*, then, is where
cleanup needs to happen.

Update `JavaObject` and `JavaException` to use the
`JniManagedPeerStates` field to track "have I been disposed?",
and then update `SetPeerReference()` to call
`JniObjectReferenceControlBlock.Free()` if the state is "disposed".

This fixes the leak.
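
The ordering issue is easier to see in a stripped-down model. The sketch below is not the actual `JavaObject` code: `PeerSketch`, `PeerStatesSketch`, and `ValueManagerSketch` are hypothetical stand-ins, and the control block is reduced to a single native `IntPtr` slot. It shows `Disposed()` only recording state, with the free deferred to `SetPeerReference()`.

```csharp
using System;
using System.Runtime.InteropServices;

[Flags]
public enum PeerStatesSketch { None = 0, Disposed = 1 }

// Hypothetical stand-in for JavaObject.
public sealed class PeerSketch {
    IntPtr controlBlock;        // native memory holding the JNI handle
    PeerStatesSketch state;

    public PeerSketch(IntPtr handle)
    {
        controlBlock = Marshal.AllocHGlobal(IntPtr.Size);
        Marshal.WriteIntPtr(controlBlock, handle);
    }

    public bool HasControlBlock => controlBlock != IntPtr.Zero;

    public IntPtr PeerReference
        => HasControlBlock ? Marshal.ReadIntPtr(controlBlock) : IntPtr.Zero;

    // Only records "I have been disposed"; does NOT free the control block.
    public void Disposed() => state |= PeerStatesSketch.Disposed;

    // Last call made by DisposePeer()/FinalizePeer(): safe place to free.
    public void SetPeerReference(IntPtr handle)
    {
        if (!HasControlBlock)
            return;
        Marshal.WriteIntPtr(controlBlock, handle);
        if ((state & PeerStatesSketch.Disposed) != 0) {
            Marshal.FreeHGlobal(controlBlock);
            controlBlock = IntPtr.Zero;
        }
    }
}

// Hypothetical stand-in for JniRuntime.JniValueManager.
public static class ValueManagerSketch {
    // Returns the handle it was able to see after Disposed() ran.
    public static IntPtr DisposePeer(PeerSketch peer)
    {
        peer.Disposed();
        IntPtr handle = peer.PeerReference;   // still readable: block not freed yet
        // Real code would delete the JNI global reference for `handle` here.
        peer.SetPeerReference(IntPtr.Zero);   // block is released here
        return handle;
    }
}
```

With the earlier behavior (freeing inside `Disposed()`), the read of `PeerReference` in `DisposePeer()` would have come back empty, which is the leak described above.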
@jonpryor
Contributor Author

The TODO list says:

Update the build system so that .NET 10 can be used to build things.

This is already supported, courtesy of overriding $(DotNetTargetFramework):

% /path/to/dotnet/android/dotnet-local.sh build -p:DotNetTargetFramework=net10.0 -t:Prepare
% /path/to/dotnet/android/dotnet-local.sh build -p:DotNetTargetFramework=net10.0

jonpryor added 3 commits May 13, 2025 11:19
Context: dotnet/runtime#115506
Context: dotnet/android#10125

We have an *API*, but not (yet) a usable *implementation*.

"Import" the `ManagedValueManager` from dotnet/android#10125, renaming
to `JavaBridgedValueManager`, and add the proposed bridge API
from dotnet/runtime#115506 to verify that it all compiles.

It compiles!

Next step: does it *work*?

If I squint right, the proposed API looks very very similar to the
existing MonoVM GC bridge API.  Can I implement the proposed API
in terms of MonoVM, and then have C# code perform the bridge code
instead of native code, when using MonoVM?

Let's find out!
Commit 9e9daf4 suggested that we could
probably "thunk" the MonoVM API to implement the proposed
Java Bridge API.

Implement the thunk!

Next up: implementing `markCrossReferences` in C#!
	dotnet publish --self-contained -p:UseMonoRuntime=true -p:DotNetTargetFramework=net8.0 -p:UseAppHost=true -p:ErrorOnDuplicatePublishOutputFiles=false samples/Hello-Java.Base/Hello-Java.Base.csproj -r osx-x64 && \
	JAVA_INTEROP_GREF_LOG=g.txt  ./samples/Hello-Java.Base/bin/Release/osx-x64/publish/Hello-Java.Base

Does it work?  Not quite.

What's present here "works", in that the managed `MarkCrossReferences`
callback *is* invoked, which just keeps all instances alive.

However, it *doesn't* fully work, for reasons I don't understand:
if `MarkCrossReferences` calls managed code, e.g.
`Console.WriteLine("here!")`, then it *hangs*.

Looks like this prototype approach can't work.