@mehmet-karaman
Contributor

@mehmet-karaman mehmet-karaman commented Oct 13, 2025

  • This PR fixes the multithreading issues described in "Sporadically thrown NPE and NSEE during issue processing after validation" (eclipse-xtext/xtext#3524).
  • added synchronize block to methods in AnnotationModel, wherever reads / writes could happen in multiple threads.
  • changed execution order, to be able to synchronize only necessary code blocks.
  • Removed dead code in AnnotationModel.cleanup because mapLock can't be null.
  • Added new javadoc to IAnnotationMap.getLockObject().
  • added Assert.isLegal check to AnnotationMap.setLockObject()

@laeubi
Contributor

laeubi commented Oct 13, 2025

@mehmet-karaman can you please explain the rationale behind the changes in more detail? I think we can assume the code is there for a reason, so if we change fundamental things we should carefully describe the reasons and causes, and why it is not, for example, a bug in the caller. Otherwise we might run into deadlocks if code that previously ran outside locks now runs under lock conditions.

@iloveeclipse
Member

The change here is coming from eclipse-xtext/xtext#3524 investigation.

@iloveeclipse
Member

@szarnekow : if you have time, would be good if you could check this PR.

@laeubi
Contributor

laeubi commented Oct 13, 2025

It would still be good to explain the individual changes in more detail and why they are required / needed. Also, I think some changes would be better as a separate PR, as they are then easier to review (e.g. "Removed dead code" or the API documentation enhancements).

@iloveeclipse
Member

Also, I think some changes would be better as a separate PR

Could you please point out which ones exactly?

@laeubi
Contributor

laeubi commented Oct 13, 2025

Each of these can be its own PR:

  • Removed dead code in AnnotationModel.cleanup because mapLock can't be null.
  • Added new javadoc to IAnnotationMap.getLockObject()
  • added Assert.isLegal check to AnnotationMap.setLockObject()

All of these seem independent and local enough to be quickly reviewed and merged, and will likely improve things without risk of regression. And even if they do cause a regression, they are easier to revert in isolation.

The next step might then be

  • changed execution order, to be able to synchronize only necessary code blocks.

and finally

  • added synchronize block to methods in AnnotationModel, wherever reads / writes could happen in multiple threads.

This will make each PR small and focused, its implications will likely be easier to understand, and it is easier to write a concise commit message.

@iloveeclipse
Member

Each of these can be its own PR:

  • Removed dead code in AnnotationModel.cleanup because mapLock can't be null.
  • Added new javadoc to IAnnotationMap.getLockObject()
  • added Assert.isLegal check to AnnotationMap.setLockObject()

I believe these could be separated from the main PR, but they should go into one PR, because they all depend on each other.

@laeubi
Contributor

laeubi commented Oct 13, 2025

I believe these could be separated from the main PR, but they should go into one PR, because they all depend on each other.

Sure, at least I think these are more of a "cleanup" and currently make the PR harder to review than it should be.

@github-actions
Contributor

github-actions bot commented Oct 13, 2025

Test Results

 3 018 files  ±0   3 018 suites  ±0   5h 20m 49s ⏱️ + 2h 46m 13s
 8 229 tests ±0   7 980 ✅ ±0  249 💤 ±0  0 ❌ ±0 
23 607 runs  ±0  22 813 ✅ ±0  794 💤 ±0  0 ❌ ±0 

Results for commit 31c83fd. ± Comparison against base commit 0ce1430.

♻️ This comment has been updated with latest results.

@iloveeclipse iloveeclipse force-pushed the fix_multithreading_problems_in_annotations branch 2 times, most recently from 34fc4f0 to 4082003 Compare October 16, 2025 15:43
@iloveeclipse
Member

I've rebased on the state of #3399, fixed the merge conflicts and addressed my own comment above.

@iloveeclipse iloveeclipse enabled auto-merge (rebase) October 16, 2025 16:36
@iloveeclipse iloveeclipse disabled auto-merge October 16, 2025 16:39
@iloveeclipse iloveeclipse force-pushed the fix_multithreading_problems_in_annotations branch 2 times, most recently from f3ef6e9 to e5d2191 Compare October 17, 2025 07:43
@iloveeclipse
Member

Test failures on Windows are unrelated.

@mehmet-karaman, @laeubi : please feel free to review.

Contributor

@laeubi laeubi left a comment

It looks OK in general, but I have not tested it.

@mehmet-karaman mehmet-karaman force-pushed the fix_multithreading_problems_in_annotations branch from e5d2191 to 703a1ad Compare October 20, 2025 07:23
@mehmet-karaman
Copy link
Contributor Author

mehmet-karaman commented Oct 20, 2025

Resolved two remaining comments and rebased on latest master.

@iloveeclipse
Member

@mehmet-karaman : you now have two commits, I assume by mistake. Can you please squash them into one?

@mehmet-karaman mehmet-karaman force-pushed the fix_multithreading_problems_in_annotations branch from 703a1ad to 5fa4d84 Compare October 20, 2025 07:42
@mehmet-karaman
Contributor Author

squashed the two commits.

@laeubi laeubi requested a review from Copilot October 20, 2025 08:08

Copilot AI left a comment

Pull Request Overview

This PR addresses multithreading issues in the AnnotationModel class by adding proper synchronization to prevent race conditions during concurrent access to annotation data. The changes ensure thread safety while maintaining performance by minimizing the scope of synchronized blocks.

Key changes:

  • Added synchronized blocks around critical sections that access shared data structures
  • Reorganized execution order to minimize time spent in synchronized blocks
  • Removed dead code and added validation checks

- added synchronize block to methods in AnnotationModel, wherever reads
/ writes could happen in multiple threads.
- changed execution order, to be able to synchronize only necessary code
blocks.

Co-authored-by: Andrey Loskutov <[email protected]>
@iloveeclipse iloveeclipse force-pushed the fix_multithreading_problems_in_annotations branch from 5fa4d84 to 31c83fd Compare October 20, 2025 09:56
Member

@iloveeclipse iloveeclipse left a comment

Funnily enough, the very first time I started the IDE from the debugger with this patch, the IDE froze immediately with this deadlock:

"main":
	at org.eclipse.jface.text.source.AnnotationModel.getRegionAnnotationIterator(AnnotationModel.java:765)
	- waiting to lock <0x0000000088000000> (a java.lang.Object)
	at org.eclipse.jface.text.source.AnnotationModel.getAnnotationIterator(AnnotationModel.java:722)
	at org.eclipse.jface.text.source.inlined.InlinedAnnotationSupport$UpdateStylesWidth.applyTextPresentation(InlinedAnnotationSupport.java:108)
	at org.eclipse.jface.text.TextViewer.changeTextPresentation(TextViewer.java:4828)
	at org.eclipse.jdt.internal.ui.javaeditor.SemanticHighlightingPresenter.updatePresentation(SemanticHighlightingPresenter.java:245)
	at org.eclipse.jdt.internal.ui.javaeditor.SemanticHighlightingPresenter.lambda$0(SemanticHighlightingPresenter.java:149)
	at org.eclipse.jdt.internal.ui.javaeditor.SemanticHighlightingPresenter$$Lambda/0x000000001ca289b0.run(Unknown Source)
	at org.eclipse.jdt.internal.ui.javaeditor.SemanticHighlightingReconciler.lambda$3(SemanticHighlightingReconciler.java:670)
	- locked <0x0000000088034398> (a java.lang.Object)
	at org.eclipse.jdt.internal.ui.javaeditor.SemanticHighlightingReconciler$$Lambda/0x000000001ca28be0.run(Unknown Source)
	at org.eclipse.swt.widgets.RunnableLock.run(RunnableLock.java:40)
	at org.eclipse.swt.widgets.Synchronizer.runAsyncMessages(Synchronizer.java:132)
	- locked <0x0000000088000010> (a org.eclipse.swt.widgets.RunnableLock)
	at org.eclipse.swt.widgets.Display.runAsyncMessages(Display.java:5079)
	at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:4548)
	at org.eclipse.jface.window.Window.runEventLoop(Window.java:824)
	at org.eclipse.jface.window.Window.open(Window.java:804)
	at org.eclipse.search.internal.ui.OpenSearchDialogAction.run(OpenSearchDialogAction.java:60)
	at org.eclipse.search.ui.NewSearchUI.openSearchDialog(NewSearchUI.java:304)
	at org.eclipse.search.internal.ui.OpenFileSearchPageAction.run(OpenFileSearchPageAction.java:51)
	at org.eclipse.ui.internal.PluginAction.runWithEvent(PluginAction.java:239)
	at org.eclipse.ui.internal.WWinPluginAction.runWithEvent(WWinPluginAction.java:218)
	at org.eclipse.jface.action.ActionContributionItem.handleWidgetSelection(ActionContributionItem.java:581)
	at org.eclipse.jface.action.ActionContributionItem.lambda$4(ActionContributionItem.java:415)
	at org.eclipse.jface.action.ActionContributionItem$$Lambda/0x000000001c4815b0.handleEvent(Unknown Source)
	at org.eclipse.swt.widgets.EventTable.sendEvent(EventTable.java:91)
	at org.eclipse.swt.widgets.Display.sendEvent(Display.java:5889)
	at org.eclipse.swt.widgets.Widget.sendEvent(Widget.java:1656)
	at org.eclipse.swt.widgets.Display.runDeferredEvents(Display.java:5104)
	at org.eclipse.swt.widgets.Display.readAndDispatch(Display.java:4545)
	at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine$5.run(PartRenderingEngine.java:1147)
	at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:339)
	at org.eclipse.e4.ui.internal.workbench.swt.PartRenderingEngine.run(PartRenderingEngine.java:1038)
	at org.eclipse.e4.ui.internal.workbench.E4Workbench.createAndRunUI(E4Workbench.java:153)
	at org.eclipse.ui.internal.Workbench.lambda$3(Workbench.java:677)
	at org.eclipse.ui.internal.Workbench$$Lambda/0x000000001c1b7658.run(Unknown Source)
	at org.eclipse.core.databinding.observable.Realm.runWithDefault(Realm.java:339)
	at org.eclipse.ui.internal.Workbench.createAndRunWorkbench(Workbench.java:583)
	at org.eclipse.ui.PlatformUI.createAndRunWorkbench(PlatformUI.java:173)
	at org.eclipse.ui.internal.ide.application.IDEApplication.start(IDEApplication.java:185)
	at org.eclipse.equinox.internal.app.EclipseAppHandle.run(EclipseAppHandle.java:219)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.runApplication(EclipseAppLauncher.java:149)
	at org.eclipse.core.runtime.internal.adaptor.EclipseAppLauncher.start(EclipseAppLauncher.java:115)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:467)
	at org.eclipse.core.runtime.adaptor.EclipseStarter.run(EclipseStarter.java:298)
	at java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(java.base@25/DirectMethodHandle$Holder)
	at java.lang.invoke.LambdaForm$MH/0x000000001c04f800.invoke(java.base@25/LambdaForm$MH)
	at java.lang.invoke.LambdaForm$MH/0x000000001c04fc00.invokeExact_MT(java.base@25/LambdaForm$MH)
	at jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(java.base@25/DirectMethodHandleAccessor.java:156)
	at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(java.base@25/DirectMethodHandleAccessor.java:104)
	at java.lang.reflect.Method.invoke(java.base@25/Method.java:565)
	at org.eclipse.equinox.launcher.Main.invokeFramework(Main.java:615)
	at org.eclipse.equinox.launcher.Main.basicRun(Main.java:563)
	at org.eclipse.equinox.launcher.Main.run(Main.java:1415)
	at org.eclipse.equinox.launcher.Main.main(Main.java:1387)
"Worker-10: Override indicator installation job":
	at org.eclipse.core.internal.filebuffers.SynchronizableDocument.addPosition(SynchronizableDocument.java:306)
	- waiting to lock <0x0000000088034398> (a java.lang.Object)
	at org.eclipse.jface.text.AbstractDocument.addPosition(AbstractDocument.java:364)
	at org.eclipse.jface.text.source.AnnotationModel.addPosition(AnnotationModel.java:492)
	at org.eclipse.jface.text.source.AnnotationModel.addAnnotation(AnnotationModel.java:462)
	at org.eclipse.jface.text.source.AnnotationModel.replaceAnnotations(AnnotationModel.java:432)
	at org.eclipse.jface.text.source.AnnotationModel.replaceAnnotations(AnnotationModel.java:401)
	at org.eclipse.jdt.internal.ui.javaeditor.OverrideIndicatorManager.updateAnnotations(OverrideIndicatorManager.java:218)
	- locked <0x0000000088000000> (a java.lang.Object)
	at org.eclipse.jdt.internal.ui.javaeditor.OverrideIndicatorManager.reconciled(OverrideIndicatorManager.java:261)
	at org.eclipse.jdt.internal.ui.javaeditor.ClassFileEditor$1.run(ClassFileEditor.java:796)
	at org.eclipse.core.internal.jobs.Worker.run(Worker.java:63)

Found 1 deadlock.

So having all the tests green doesn't mean we are fine :-)

@iloveeclipse
Member

This is the javadoc from org.eclipse.jface.text.source.AnnotationModel:

All modifications of the model's internal annotation map are synchronized using the model's lock object.

This is the javadoc from org.eclipse.jface.text.ISynchronizable:

In order to reduce the probability of dead locks clients should synchronize their access to these objects by using the provided lock object

So the code in AnnotationModel "can" use the lock object returned by getLockObject() only to protect the annotation map, but not the model itself! OMG!

So existing clients (like the one seen in the deadlock above) can freely use the model's lock object to lock at any time, on any thread, outside the model code. The problem is built in "by design". No one should synchronize from outside on an object's internal lock, as org.eclipse.jface.text.ISynchronizable proposes, because there is no guarantee what the client code does and which locks it will acquire while holding the object's internal lock...

I don't know why it was designed this way, but this is an invitation for trouble, an anti-pattern of MT code.

In the example deadlock we see that

  • OverrideIndicatorManager.updateAnnotations() obtains the lock on the model; the model then calls document.addPosition(), which waits for the lock on the document.
  • SemanticHighlightingReconciler (in lambda$3, which calls SemanticHighlightingPresenter.updatePresentation()) obtains the lock on the document object and waits for the lock on the model in getRegionAnnotationIterator().

The problem is that client code (as seen in the deadlock stack) can hold other locks while calling into the model, and the model code can request locks from "outside" code in multiple places, such as when interacting with the document or sending change events.
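
The lock-ordering problem described above can be reproduced in isolation. The following is only a minimal sketch, not Eclipse code: `LockOrderDemo`, `tryBoth` and the two lock names are hypothetical stand-ins, and `ReentrantLock.tryLock` with a timeout is used instead of `synchronized` so that the demo terminates instead of hanging like the real IDE did.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the model lock and the document lock; not Eclipse code. */
final class LockOrderDemo {

	/**
	 * Takes 'first', waits until the other thread holds its own first lock, then
	 * tries to take 'second' with a timeout. 'first' is held until both threads
	 * have finished their attempt, mirroring threads that never back off.
	 */
	static boolean tryBoth(ReentrantLock first, ReentrantLock second,
			CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
		first.lock();
		try {
			bothHoldFirst.countDown();
			bothHoldFirst.await();
			boolean gotSecond = second.tryLock(200, TimeUnit.MILLISECONDS);
			if (gotSecond) {
				second.unlock();
			}
			bothTried.countDown();
			bothTried.await();
			return gotSecond;
		} catch (InterruptedException e) {
			Thread.currentThread().interrupt();
			return false;
		} finally {
			first.unlock();
		}
	}

	/** Runs a "worker" (model, then document) against a "UI" thread (document, then model). */
	static boolean[] run() {
		ReentrantLock modelLock = new ReentrantLock();
		ReentrantLock documentLock = new ReentrantLock();
		CountDownLatch bothHoldFirst = new CountDownLatch(2);
		CountDownLatch bothTried = new CountDownLatch(2);
		boolean[] gotSecond = new boolean[2];

		Thread worker = new Thread(() -> gotSecond[0] = tryBoth(modelLock, documentLock, bothHoldFirst, bothTried));
		Thread ui = new Thread(() -> gotSecond[1] = tryBoth(documentLock, modelLock, bothHoldFirst, bothTried));
		worker.start();
		ui.start();
		try {
			worker.join();
			ui.join();
		} catch (InterruptedException e) {
			Thread.currentThread().interrupt();
			throw new IllegalStateException(e);
		}
		return gotSecond;
	}

	public static void main(String[] args) {
		boolean[] gotSecond = run();
		System.out.println("worker got document lock: " + gotSecond[0]);
		System.out.println("UI got model lock: " + gotSecond[1]);
	}
}
```

Neither thread obtains its second lock here. With plain `synchronized` blocks there is no timeout, so the same interleaving blocks both threads forever.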

In the PR, besides other changes, we've added extra synchronization (via getLockObject()) to the following methods that didn't acquire the lock before:

  • getRegionAnnotationIterator()
  • getAnnotationIterator()
  • addAnnotationModel()
  • removeAnnotationModel()

These locks were added because the code inside can be affected by MT changes to the internal data fAttachments and fPositions.

The deadlock we see happened in getRegionAnnotationIterator(). However, it could happen in any of the four methods above, and it is just a matter of time until we see another deadlock when using getLockObject().

One solution would be to change (break) the API and not expose the internal object lock, but this would require refactoring all client code, which is simply not possible.

Another solution would be to introduce yet another lock object to protect AnnotationModel itself from concurrent modifications. This was avoided in 2619fb4, which tried to deal with the same problems by using ConcurrentHashMap, but ConcurrentHashMap only guarantees internal consistency, not the consistency of iteration operations on the map like fAttachments.get(it.next()).

But introducing yet another lock object would not reduce MT problems; it would rather increase the possibility of deadlocks, given the current state of the code.

So if we made the code in AnnotationModel properly synchronized, it would deadlock sooner or later!
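
To illustrate the ConcurrentHashMap limitation mentioned above: the iterator and a later `get()` are two separate operations, so an entry can disappear in between. A minimal sketch follows, where `attachments` is a hypothetical stand-in for fAttachments and the concurrent removal is simulated inline on the same thread to keep the example deterministic.

```java
import java.util.Iterator;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical demo class; 'attachments' stands in for fAttachments. */
final class WeaklyConsistentIterationDemo {

	/** Returns what get() yields for a key that was removed after the iterator handed it out. */
	static String lookupAfterConcurrentRemoval() {
		ConcurrentHashMap<String, String> attachments = new ConcurrentHashMap<>();
		attachments.put("key", "attached model");

		Iterator<String> it = attachments.keySet().iterator();
		String key = it.next();       // the iterator still hands out the key
		attachments.remove(key);      // simulates another thread removing the attachment right now
		return attachments.get(key);  // the follow-up lookup no longer finds a value
	}

	public static void main(String[] args) {
		String value = lookupAfterConcurrentRemoval();
		System.out.println("value after removal: " + value);
		// Dereferencing 'value' at this point, as fAttachments.get(it.next()).getAnnotationIterator()
		// does, would throw a NullPointerException.
	}
}
```

The map itself never becomes corrupt in this scenario; only the compound "iterate, then look up" sequence is unprotected.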

So back to the original problem with exceptions like in eclipse-xtext/xtext#3524: if we can't make AnnotationModel thread safe by design, which other choices do we have?

This leads us to a possible workaround (I can't call it a solution anymore): while using fAttachments and fPositions, don't use iterators but iterate over the map only via one of the multiple ConcurrentHashMap.forEach... methods, which could probably fix eclipse-xtext/xtext#3524 but of course would never guarantee any real consistency of the underlying data (which is already the case today). Also, addAnnotationModel() is "special" as it checks containsValue() before inserting something, so that code should probably be rewritten via forEachValue()... Non-trivial too.
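
A minimal sketch of the forEach-based idea, assuming a plain `ConcurrentHashMap<String, String>` as a hypothetical stand-in for fAttachments: the action passed to `forEach` receives key and value together, so there is no separate `get()` that could return null. The removal inside the action only simulates a concurrent removeAnnotationModel() call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical demo class; 'attachments' stands in for fAttachments. */
final class ForEachSnapshotDemo {

	/** Collects the values via forEach while one entry is removed mid-traversal. */
	static List<String> snapshotWhileRemoving() {
		ConcurrentHashMap<String, String> attachments = new ConcurrentHashMap<>();
		attachments.put("a", "model A");
		attachments.put("b", "model B");
		attachments.put("c", "model C");

		List<String> seen = new ArrayList<>();
		attachments.forEach((key, value) -> {
			attachments.remove("b");  // simulates a concurrent removal during the traversal
			seen.add(value);          // the action is only ever handed non-null values
		});
		return seen;
	}

	public static void main(String[] args) {
		System.out.println("values seen: " + snapshotWhileRemoving());
	}
}
```

Whether the removed entry is still visited depends on traversal order, which is exactly the "no real consistency" caveat: the result is free of nulls and exceptions, but it is not a point-in-time snapshot.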

And the last workaround I can imagine would be: since we know that the code in the methods above needs locks or would otherwise run into trouble, we could retry the operations inside the methods if an exception like NoSuchElementException or NullPointerException happens. Something like:

	private Iterator<Annotation> getAnnotationIterator(boolean cleanup, boolean recurse) {
		Iterator<Annotation> iter= getAnnotationIterator(cleanup);
		if (!recurse || fAttachments.isEmpty()) {
			return iter;
		}
		List<Iterator<Annotation>> iterators= null;
		// Retry until one pass over fAttachments completes without being tripped up by a concurrent change.
		while (iterators == null) {
			try {
				iterators= getAnnotationIterator(iter);  // can't use lock here...
			} catch (NullPointerException | NoSuchElementException e) {
				// ignore and retry
			}
		}
		return new MetaIterator<>(iterators.iterator());
	}

	// Collects the model's own iterator plus one iterator per attached model.
	// May throw if fAttachments changes concurrently; the caller above retries in that case.
	private List<Iterator<Annotation>> getAnnotationIterator(Iterator<Annotation> iter) {
		List<Iterator<Annotation>> iterators= new ArrayList<>(fAttachments.size() + 1);
		iterators.add(iter);
		Iterator<Object> it= fAttachments.keySet().iterator();
		while (it.hasNext()) {
			iterators.add(fAttachments.get(it.next()).getAnnotationIterator());
		}
		return iterators;
	}

It will not make anything thread safe, but clients already deal with the non-thread-safe implementation of AnnotationModel; they just would no longer need to catch exceptions like catch (NullPointerException | NoSuchElementException e) in their own code...

So there are now two workarounds possible but no real solution in sight. I would probably first look into removing the added locks in the methods below and replacing whatever operations were used with the concurrent operations provided by the ConcurrentHashMap.forEach... methods and similar.
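
The retry idea could also be written with an upper bound on the number of attempts, so that a persistent failure surfaces instead of spinning forever. This is only a sketch under that assumption: `BoundedRetry`, `retryOnRace` and `maxAttempts` are hypothetical names and not part of AnnotationModel.

```java
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/** Hypothetical helper; not part of AnnotationModel. */
final class BoundedRetry {

	/**
	 * Runs the operation and retries when it fails with one of the two exceptions
	 * attributed to concurrent modification. After maxAttempts failures the last
	 * exception is rethrown.
	 */
	static <T> T retryOnRace(Supplier<T> operation, int maxAttempts) {
		if (maxAttempts < 1) {
			throw new IllegalArgumentException("maxAttempts must be at least 1");
		}
		RuntimeException last = null;
		for (int attempt = 1; attempt <= maxAttempts; attempt++) {
			try {
				return operation.get();
			} catch (NullPointerException | NoSuchElementException e) {
				last = e;  // remember the failure and try again
			}
		}
		throw last;
	}

	public static void main(String[] args) {
		int[] calls = {0};
		String result = retryOnRace(() -> {
			if (++calls[0] < 3) {
				throw new NullPointerException("simulated race");
			}
			return "iterators collected";
		}, 5);
		System.out.println(result + " after " + calls[0] + " attempts");
	}
}
```

Like the unbounded loop, this does not make anything thread safe; it only decides what happens when the race keeps recurring.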

  • getRegionAnnotationIterator()
  • getAnnotationIterator()
  • addAnnotationModel()
  • removeAnnotationModel()

I believe the other code that is changed in this PR could stay as proposed, but I need more time to understand all implications, considering that the four methods above can't be made MT safe anymore.

@laeubi
Contributor

laeubi commented Oct 20, 2025

From the description it feels like one needs a complete rework of the data model instead... In any case, I think the idea with the lock object is/was that one can prevent modifications or inhibit events across multiple objects.

Regarding the Xtext case, maybe the caller should already lock on the lock object itself when it enters a critical section, and be careful not to hold any additional locks?
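
A minimal sketch of what such caller-side locking could look like. `TinyModel` is a hypothetical stand-in loosely modelled on the getLockObject() idea, not the real AnnotationModel: the caller holds only the model's lock, copies what it needs, and does any further work after releasing the lock.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical demo; TinyModel is not an Eclipse class. */
final class CallerSideLockingDemo {

	/** Stand-in for a model that exposes its lock object to clients. */
	static final class TinyModel {
		private final Object lock = new Object();
		private final Map<String, String> annotations = new LinkedHashMap<>();

		Object getLockObject() {
			return lock;
		}

		void add(String id, String text) {
			synchronized (lock) {
				annotations.put(id, text);
			}
		}

		/** Not thread safe on its own; callers are expected to hold the lock object. */
		Iterable<String> values() {
			return annotations.values();
		}
	}

	/** Copies the annotation texts while holding only the model lock. */
	static List<String> snapshot(TinyModel model) {
		List<String> copy = new ArrayList<>();
		synchronized (model.getLockObject()) {
			for (String text : model.values()) {
				copy.add(text);
			}
		}
		return copy;  // further processing happens outside the lock
	}

	public static void main(String[] args) {
		TinyModel model = new TinyModel();
		model.add("1", "warning");
		model.add("2", "error");
		System.out.println(snapshot(model));
	}
}
```

This only helps if the caller really holds no other lock when entering the synchronized block, which is the condition raised in the comment above.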

@mehmet-karaman
Contributor Author

It seems this wasn't the first attempt to improve the AnnotationModel.

#892
