Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

facets: integrate combined_subjects / fix nested subject faceting #1625

Conversation

fenekku
Copy link
Contributor

@fenekku fenekku commented Nov 29, 2023

subjects_qed

@fenekku fenekku force-pushed the 798_faceting_fix_records_resources_integration branch 3 times, most recently from 63e78b0 to 83ccf45 Compare November 29, 2023 23:58
# subject_combined does require a pre-existing change to indexed documents,
# so it's unclear if a direct replacement is right.
# Keeping it around until v13 might be better. On the flipside it is an incorrect
# facet...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't know what we want to do here:

  • do we replace "subject_nested": {"facet": facets.subject_nested} by "subject_nested": {"facet": facets.subject_combined} directly (or even in facets.py change subject_nested)?
  • or do we do as here which is a more gradual transition (but leaves people with the issue if they don't make the switch)?
  • other takes?

@@ -339,7 +341,7 @@ class RDMDraft(CommonFieldsMixin, Draft):

model_cls = models.RDMDraftMetadata

index = IndexField("rdmrecords-drafts-draft-v6.0.0", search_alias="rdmrecords")
index = IndexField("rdmrecords-drafts-draft-v6.1.0", search_alias="rdmrecords")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going for "6.1.0" instead of 7.0.0 because it only adds to the corresponding jsonschema 6.0.0 . And since Zenodo probably uses 6.0.0 now, we can't just edit that version.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mappings names need to align with the jsonschemas so we'll either need a new jsonschema or simply add it to existing v6.0.0. I just did a reindex (on the search cluster) of all of records/drafts zenodo today, so it's fairly ok - about 15min.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Switched back to using 6.0.0, so it will require a re-indexing of Zenodo, but the mapping version will align exactly to the jsonschema version now.

from invenio_records.dumpers import SearchDumperExt


SPLITCHAR = "::" # explict better than implicit
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe a little bit weird, but I don't want to rely on defaults to make sure that both CombinedTermsFacet and CombinedSubjectsDumperExt use the same splitchar (which should really be called splitstr, but other Facets use splitchar, so I kept with it). It's better to be explicit.

def deprecated_subject_nested():
"""Deprecated NestedTermsFacet.

Will warn until this is completely removed.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, not ideal, but if we go the "soft" transition route of letting subject_nested around, (so that re-indexing can be done on administrator's schedule), having a deprecation warning is a good way to alert the administrator. It's unfortunate that it will alert even after the switch is made, but there isn't really an alternative (unless there is no deprecation warning) until it's removed.

@fenekku fenekku force-pushed the 798_faceting_fix_records_resources_integration branch from 83ccf45 to ca25cfa Compare December 18, 2023 19:37
@fenekku fenekku requested a review from lnielsen December 18, 2023 19:39
Copy link
Contributor

This PR was automatically marked as stale.

@github-actions github-actions bot added the stale label Feb 17, 2024
@fenekku fenekku removed the stale label Feb 19, 2024
@fenekku fenekku force-pushed the 798_faceting_fix_records_resources_integration branch from ca25cfa to dceb433 Compare April 11, 2024 16:25
@fenekku fenekku force-pushed the 798_faceting_fix_records_resources_integration branch from dceb433 to eeda5fb Compare April 11, 2024 18:45
@fenekku fenekku merged commit f2116da into inveniosoftware:master Apr 11, 2024
4 checks passed
@fenekku fenekku deleted the 798_faceting_fix_records_resources_integration branch April 11, 2024 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

subject_nested facet issue Subjects are improperly aggregated
2 participants