Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Pruning with logical AND instead of OR #2391

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

vaustrup
Copy link

@vaustrup vaustrup commented Dec 7, 2023

Description

As initially discussed in #2374, being able to chain keywords in the pruning function with a logical AND rather than an OR would be very valuable as it would allow pruning e.g. modifiers only in a certain region.

With this PR, an additional keyword is introduced in the Workspace.prune function: mode which can either be logical_or (default, current behaviour) or logical_and, as suggested by @alexander-held in #2374. A ValueError is raised if a different value is set.

The Workspace.prune function is extended to call Workspace._prune_and_rename separately for each of the keywords (measurements, regions, samples, ...) when mode == 'logical_or', resulting in the same behaviour as before. On the other hand, the logic is more complicated when mode == 'logical_and'. To achieve a logical AND between keywords, Workspace._prune_and_rename is modified.

Before proceeding, we need to agree on the desired behaviour since additional logic may be required. Consider the following cases:

# should prune the 'lumi' parameter in
# channels 'channel1' and 'channel2'
# but keep it in measurements
# if it is used in other channels still
ws.prune(
	channels=["channel1", "channel2"], 
	modifiers=["lumi"],
	mode="logical_and",
)

# should prune the 'lumi' parameter
# in measurement 'measurement1'
# but keep it for all regions and samples
# if it is used in other measurements still
ws.prune(
	modifiers=["lumi"],
	measurements=["measurement1"],
	mode="logical_and",
)

# this should fail because we cannot prune
# samples and channels for certain measurements
ws.prune(
	modifiers=["mod1", "mod2"],
	samples=["process1", "process2"],
	measurements=["measurement1"],
	channels=["channel1", "channel2"],
	mode="logical_and",
)

Should the option of using logical_and also be added to Workspace.rename to e.g. allow renaming modifiers only for a certain sample?

Checklist Before Requesting Reviewer

  • Tests are passing
  • "WIP" removed from the title of the pull request
  • Selected an Assignee for the PR to be responsible for the log summary

Before Merging

For the PR Assignees:

  • Summarize commit messages into a comprehensive review of the PR

@vaustrup vaustrup force-pushed the feat_enhance_pruning branch from aa4a0bb to 0fa5f6c Compare December 8, 2023 08:51
@matthewfeickert matthewfeickert changed the title WIP: Pruning with logical AND instead of OR feat: Pruning with logical AND instead of OR Dec 8, 2023
@vaustrup vaustrup force-pushed the feat_enhance_pruning branch from 0fa5f6c to f231fc3 Compare December 8, 2023 10:51
@vaustrup vaustrup force-pushed the feat_enhance_pruning branch from d92faa4 to 0360984 Compare December 8, 2023 16:21
Copy link
Author

@vaustrup vaustrup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matthewfeickert I think it would be good if someone could take a look at the implementation. Code needs some clean-up still but would be good to confirm this is a good direction to take. The logic is getting a bit messy because whether or not to prune e.g. a sample now depends also on other keywords. It also assumes that the implied hierarchy of keywords (e.g. prune channels in prune_channels only if no samples or modifiers to prune are specified) is intuitive to the user.

Another idea, brought forward by @lorenzennio, was to allow providing strings of the form "//" to prune_modifiers to indicate only in and should be pruned. That way we could do without adding the extra logic to Workspace.prune. I'm not sure it is necessarily cleaner to implement though, and it would still require handling edge cases like removing parameters from measurements if the same parameters have been removed from all samples in the workspace (unless it's ok to give this responsibility to the user?).

Tagging also @alexander-held

prune_channels=channels,
prune_measurements=measurements,
if mode == "logical_and":
if samples != [] and measurements != []:
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure whether these sanity checks should be made here or in _prune_and_rename. Or if they are needed at all for that matter.

prune_channels=channels,
prune_measurements=measurements,
)
return (
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Behaviour of _prune_and_rename was changed to logical_and, so for logical_or we can just chain separate calls for individual keywords.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant