Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue #11189 part 00 refactor citation relation tab logic #11845

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

alexandre-cremieux
Copy link
Contributor

@alexandre-cremieux alexandre-cremieux commented Sep 28, 2024

Refs #11189

This contributions aims to simplify the citations/references fetching and caching logic by introducing two layers:

  • service
  • repository

This should help to make this feature more extendable without modifying orchestration logic following open/close principle.

Also, this PR will allow to introduce a new caching logic in coming PR.

Missing requirements for merging will come after draft review.

Mandatory checks

  • Change in CHANGELOG.md described in a way that is understandable for the average user (if applicable)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

@alexandre-cremieux alexandre-cremieux marked this pull request as draft September 28, 2024 20:38
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code currently does not meet JabRef's code guidelines.
We use Checkstyle to identify issues.
The tool reviewdog already placed comments on GitHub to indicate the places. See the tab "Files" in you PR.
Please carefully follow the setup guide for the codestyle.
Afterwards, please run checkstyle locally and fix the issues.

You can check review dog's comments at the tab "Files changed" of your pull request.

@alexandre-cremieux alexandre-cremieux force-pushed the fix-issue-11189-part-00-refactor-citation-relation-tab-logic branch from 9fe8522 to cbe9e96 Compare September 28, 2024 21:15
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code currently does not meet JabRef's code guidelines.
We use Checkstyle to identify issues.
The tool reviewdog already placed comments on GitHub to indicate the places. See the tab "Files" in you PR.
Please carefully follow the setup guide for the codestyle.
Afterwards, please run checkstyle locally and fix the issues.

You can check review dog's comments at the tab "Files changed" of your pull request.

@alexandre-cremieux alexandre-cremieux force-pushed the fix-issue-11189-part-00-refactor-citation-relation-tab-logic branch from cbe9e96 to 8231340 Compare September 28, 2024 21:51
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code currently does not meet JabRef's code guidelines.
We use OpenRewrite to ensure "modern" Java coding practices.
The issues found can be automatically fixed.
Please execute the gradle task rewriteRun, check the results, commit, and push.

You can check the detailed error output by navigating to your pull request, selecting the tab "Checks", section "Tests" (on the left), subsection "OpenRewrite".

@alexandre-cremieux alexandre-cremieux force-pushed the fix-issue-11189-part-00-refactor-citation-relation-tab-logic branch from 8231340 to 33967c2 Compare September 28, 2024 22:20
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your code currently does not meet JabRef's code guidelines.
We use OpenRewrite to ensure "modern" Java coding practices.
The issues found can be automatically fixed.
Please execute the gradle task rewriteRun, check the results, commit, and push.

You can check the detailed error output by navigating to your pull request, selecting the tab "Checks", section "Tests" (on the left), subsection "OpenRewrite".

@alexandre-cremieux alexandre-cremieux force-pushed the fix-issue-11189-part-00-refactor-citation-relation-tab-logic branch from 33967c2 to 592d4d7 Compare September 28, 2024 22:47
@koppor koppor changed the title Fix issue 11189 part 00 refactor citation relation tab logic Fix issue #11189 part 00 refactor citation relation tab logic Sep 29, 2024
@alexandre-cremieux alexandre-cremieux force-pushed the fix-issue-11189-part-00-refactor-citation-relation-tab-logic branch 2 times, most recently from d94f4d3 to 3155242 Compare September 29, 2024 13:24
Copy link
Contributor Author

@alexandre-cremieux alexandre-cremieux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add code explanations to the PR


import org.eclipse.jgit.util.LRUMap;

public class BibEntryRelationsCache {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed and tested. Also, we were not merging the relations but overwriting them when cacheOrMerge... was called (see code and fix).

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class BibEntryRelationsRepository {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed and tested.

@@ -71,7 +74,7 @@ public class CitationRelationsTab extends EntryEditorTab {
private final GuiPreferences preferences;
private final LibraryTab libraryTab;
private final TaskExecutor taskExecutor;
private final BibEntryRelationsRepository bibEntryRelationsRepository;
private final SearchCitationsRelationsService searchCitationsRelationsService;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduces a service layer that segregates the fetching and repository logic definitions.

.onSuccess(fetchedList -> onSearchForRelationsSucceed(entry, listView, abortButton, refreshButton,
searchType, importButton, progress, fetchedList, observableList))
this.createBackGroundTask(entry, searchType, shouldRefresh)
.consumeOnRunning(task -> prepareToSearchForRelations(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could probably be renamed applyOnRunning(Consumer<Task> consumer).

) {
return switch (searchType) {
case CitationFetcher.SearchType.CITES -> {
citingTask = BackgroundTask.wrap(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not really appreciate this solution. The method should return a Callable instead of BackGroundTask and it should not be possible to restart a search if one is already running for same tab. I propose to refactor this in a next PR but lets focus on the cache refactoring first. For now logic is same as before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK for me.

You can add a TODO comment if you want.

private static final Map<DOI, Set<BibEntry>> REFERENCES_MAP = new LRUMap<>(MAX_CACHED_ENTRIES, MAX_CACHED_ENTRIES);

public List<BibEntry> getCitations(BibEntry entry) {
return entry
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returns a copy now.

.toList();
}

public void cacheOrMergeCitations(BibEntry entry, List<BibEntry> citations) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method used to rewrite data, now it does merge inputs according the method name.


import org.jabref.model.entry.BibEntry;

public class LRUBibEntryRelationsRepository implements BibEntryRelationsRepository {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the fetcher logic from previous implementation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why LRUBibEntryRelationsCache and LRUBibEntryRelationsRepository are separated? Can't they be in one class LRUBibEntryRelationsRepository?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not super sure of the final implementation yet. I will take the emerging model cause I would like to try to be able to chain caches (something like one for the disk, one in memory...).

However the cache was already dissociated from the repository in previous implementation. The cache is for now making use of static fields and I clearly prefer to avoid referencing them from the repository. The cache would preferably be a singleton while repository should be dedicated to each tab instance.

Also, I guess that the repository here serves as an adapter between the domain code and the low level logic (if any).

Finally that way, I am sure that we can re-use the repository for test without having to instantiate the cache itself if needed.

@@ -1,4 +1,4 @@
package org.jabref.gui.entryeditor.citationrelationtab.semanticscholar;
package org.jabref.logic.importer.fetcher;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this to the logic package, Fecthing is more like a back-end process - should belong to an adapter layer.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class SearchCitationsRelationsService {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fetching and Repository logic can now be injected in orchestration logic that should not vary in next PR.

This should also make it possible to configure a citation search service based on the execution context. This approach can also enable new features like offering the user the possibility to choose between multiple fetchers targeting another online search engine.

@Siedlerchr
Copy link
Member

Please no force push if not needed. All commits will be squashed when merged

* Move repository, cache, and fetcher to logic package
* Move citations model to model/citations/semanticscholar package
* Introduce service layer
* Rename LRU cache implementation
* Add tests helpers for repository
* Move logic from repository to service
* Refactor repositories
* Update tab configuration
@alexandre-cremieux alexandre-cremieux force-pushed the fix-issue-11189-part-00-refactor-citation-relation-tab-logic branch from 3155242 to 18db75e Compare September 29, 2024 15:01
@alexandre-cremieux
Copy link
Contributor Author

Please no force push if not needed. All commits will be squashed when merged

Sorry, just re-based main branch locally.

…lation-tab-logic

# Conflicts:
#	src/main/java/org/jabref/gui/entryeditor/citationrelationtab/CitationRelationsTab.java
Copy link
Member

@koppor koppor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general looks good. Some minor comments.

Sorry for delay. Please go ahead with everything.

@@ -0,0 +1,60 @@
package org.jabref.logic.citation.service;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to create a package for for a single class. The class can reside into the package org.jabref.logic.citation.

Comment on lines 443 to 445
new Label(Localization.lang(
"Error while fetching citing entries: %0", exception.getMessage())
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not reformat. Our tooling cannot deal with that - see https://devdocs.jabref.org/code-howtos/localization.html for some hints.

) {
return switch (searchType) {
case CitationFetcher.SearchType.CITES -> {
citingTask = BackgroundTask.wrap(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK for me.

You can add a TODO comment if you want.

}
return List.of();
},
null
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about the "null" here. But I think, it is OK for now.

We want to go away with nulls in JabRef. If we have it, we annotate with jspecify. But in tests, its ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is just an on purpose mock, it should not be used from the application code.

I agree with you, null is an open door to bad experiences... like de mocks frameworks sometimes (or often).

@koppor koppor marked this pull request as ready for review October 10, 2024 20:29
@koppor
Copy link
Member

koppor commented Oct 10, 2024

Small other comments - IntelliJ proposed to extract a method

private static SearchCitationsRelationsService getSearchCitationsRelationsService(BibEntry cited, List<BibEntry> citationsToReturn, BibEntryRelationsRepository citationsToReturn1) {

in the tests - maybe you can also include that.

@koppor
Copy link
Member

koppor commented Oct 10, 2024

@alexandre-cremieux Please pull before you continue working on it - I merged main for you (and resolved conflicts).


import org.jabref.model.entry.BibEntry;

public class LRUBibEntryRelationsRepository implements BibEntryRelationsRepository {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why LRUBibEntryRelationsCache and LRUBibEntryRelationsRepository are separated? Can't they be in one class LRUBibEntryRelationsRepository?

var errMsg = "Error while fetching references for entry %s".formatted(
referencer.getTitle()
);
LOGGER.error(errMsg);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Errors should be handled a little bit other way.

Like this:

LOGGER.error("Error while fetching references for entry %0", references.getTitle(), e)

(Hope I haven't missed the syntax and parameters)

So you see:

  • Error should be the last argument, so that we have the full information.
  • And LOGGER could be parametrized

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Logger, the syntax for placeholders is {}. %0 is used in the localization.

var errMsg = "Error while fetching citations for entry %s".formatted(
cited.getTitle()
);
LOGGER.error(errMsg);
Copy link
Collaborator

@InAnYan InAnYan Oct 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Besides, could there be some kind of special error type, @koppor, or we can leave it as Exception?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, didn't see that. A new exception inheriting from org.jabref.logic.JabRefException should be introduced.

@alexandre-cremieux
Copy link
Contributor Author

@alexandre-cremieux Please pull before you continue working on it - I merged main for you (and resolved conflicts).

Thanks for the review and the merge. I will resume the work on this branch and apply the changes.

@koppor
Copy link
Member

koppor commented Nov 8, 2024

@alexandre-cremieux Sorry for the merge conflicts - can you handle them? I was always happy with IntelliJ's "resolve merge conflicts" dialog. Hope, it works in this case, too.

@alexandre-cremieux
Copy link
Contributor Author

alexandre-cremieux commented Nov 8, 2024

@alexandre-cremieux Sorry for the merge conflicts - can you handle them? I was always happy with IntelliJ's "resolve merge conflicts" dialog. Hope, it works in this case, too.

Hello @koppor . Seems that we have new conflicts to resolve to be able to merge main. But I will do that when the feature will be fully developed. Was quite busy last month, I resumed the work this week. PR comments were addressed.

@koppor
Copy link
Member

koppor commented Nov 11, 2024

@alexandre-cremieux Sorry for the merge conflicts - can you handle them? I was always happy with IntelliJ's "resolve merge conflicts" dialog. Hope, it works in this case, too.
Hello @koppor . Seems that we have new conflicts to resolve to be able to merge main. But I will do that when the feature will be fully developed. Was quite busy last month, I resumed the work this week. PR comments were addressed.

Good to hear. - I think, huge changes won't be done in main the next weeks. Thus, it would be a good idea to merge main. This would enable the CI to run. I could try to merge if I have time.

@alexandre-cremieux
Copy link
Contributor Author

Hello @koppor . Thanks for your answer.

Please do not merge main, there is a discussion we should probably have before going further.

Working on the MVStore implementation, it became more clear that an ad-hoc serialization was needed to be able to store the BibEntry to disk and the way it should be done make me understand that the BibEntry type is maybe not the most suitable one to address the CitationTab use case.

You can find the related tests I wrote : commit

I will expose a deeper analysis and a proposal on the issue page for us to be able to discuss the design, here: #11189

@koppor
Copy link
Member

koppor commented Nov 11, 2024

Working on the MVStore implementation, it became more clear that an ad-hoc serialization was needed to be able to store the BibEntry to disk

Why do you need the full BibEntry on disk?

In the AI functionality, we just used the citation key - see https://github.com/JabRef/jabref/blob/main/docs/decisions/0034-use-citation-key-for-grouping-chat-messages.md. I think, you need to persist information accross sessions.

and the way it should be done make me understand that the BibEntry type is maybe not the most suitable one to address the CitationTab use case.

Ah, I think, you want to store the BibEntries NOT contained in the current library. Since the result of the server is a BibEntry (isn't it?), it is the right data type?

I will expose a deeper analysis and a proposal on the issue page for us to be able to discuss the design, here: #11189

If the design is close to the code, discuss here. The issue is more user-facing.

@alexandre-cremieux
Copy link
Contributor Author

alexandre-cremieux commented Nov 12, 2024

Why do you need the full BibEntry on disk?

Hello. I do not need to save the full BibEntry to disk. It is the main concern: as we do not need the full set of fields of a BibEntry to represent a citation relation then why do we use BibEntry structure to represent the citation relation ?

Ah, I think, you want to store the BibEntries NOT contained in the current library. Since the result of the server is a BibEntry (isn't it?), it is the right data type?

In fact my proposal was more to use a dedicated data structure to citation relation to store them rather than the BibEntry (the data structure should off course belong to JabRef itself). Main logic will not change.

@koppor
Copy link
Member

koppor commented Nov 13, 2024

Hello. I do not need to save the full BibEntry to disk. It is the main concern: as we do not need the full set of fields of a BibEntry to represent a citation relation then why do we use BibEntry structure to represent the citation relation ?

A BibEntry does not enforce to store all fields.

JabRef has all the logic to insert a BibEntry into a library, check duplicates, ... all based on BibEntry. It can also render a BibEntry based on the selected preview style (which the current csl lib maybe does not do, but it should for consistency reasons).

I know that entry.getField(StandardField.TITLE) reads strange. It is OK for me to introduce methods such as org.jabref.model.entry.BibEntry#getTitle for readability.

@alexandre-cremieux
Copy link
Contributor Author

alexandre-cremieux commented Nov 13, 2024

Hello. I do not need to save the full BibEntry to disk. It is the main concern: as we do not need the full set of fields of a BibEntry to represent a citation relation then why do we use BibEntry structure to represent the citation relation ?

A BibEntry does not enforce to store all fields.

JabRef has all the logic to insert a BibEntry into a library, check duplicates, ... all based on BibEntry. It can also render a BibEntry based on the selected preview style (which the current csl lib maybe does not do, but it should for consistency reasons).

I know that entry.getField(StandardField.TITLE) reads strange. It is OK for me to introduce methods such as org.jabref.model.entry.BibEntry#getTitle for readability.

Hello @koppor

Thanks for the reply. I thought also about the comparator as you suggested on the issue page, this solution should satisfy our need (even if I would have prefer to separate the two contexts). I will resume the work using BibEntry to store the citation in the MVStore as this is agreed between us, and rely on JabRef's logic for duplication search.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants