@ycexiao
Collaborator

@ycexiao ycexiao commented Sep 17, 2025

What problem does this PR address?

Implement some of the functionalities mentioned in #256

(1) A --check-increase option is added for the squeeze morph. When it is set, a non-strictly-increasing x raises an error; otherwise, only a warning is issued.

(2) Functions to sort the squeezed x and to remove duplicate x values are implemented inside morphsqueeze.py.

(3) Tests for the error and warning behavior are added.
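
As a rough illustration of the behavior described in (1) and (2), here is a minimal sketch assuming plain numpy arrays. The function name `sort_and_dedupe_squeezed_grid`, its `check_increase` parameter, and the message text are hypothetical and are not taken from morphsqueeze.py.

```python
import warnings

import numpy as np


def sort_and_dedupe_squeezed_grid(x_squeezed, y, check_increase=False):
    """Return (x, y) with x strictly increasing (hypothetical helper)."""
    x_squeezed = np.asarray(x_squeezed, dtype=float)
    y = np.asarray(y, dtype=float)
    # Nothing to do if the grid is already strictly increasing.
    if np.all(np.diff(x_squeezed) > 0):
        return x_squeezed, y
    message = "Squeezed grid is not strictly increasing."
    if check_increase:
        raise ValueError(message)
    warnings.warn(message)
    # A stable sort keeps the original order among equal x values.
    order = np.argsort(x_squeezed, kind="stable")
    x_sorted, y_sorted = x_squeezed[order], y[order]
    # np.unique reports the index of the first occurrence of each x value.
    x_unique, first = np.unique(x_sorted, return_index=True)
    return x_unique, y_sorted[first]
```

With `check_increase=True` the same input raises `ValueError` instead of being sorted, which mirrors the error-versus-warning split described above.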

What should the reviewer(s) do?

Please check the implementations.

@codecov

codecov bot commented Sep 17, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.93%. Comparing base (f5fc9ac) to head (4910a65).

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #259   +/-   ##
=======================================
  Coverage   99.92%   99.93%           
=======================================
  Files          24       24           
  Lines        1398     1471   +73     
=======================================
+ Hits         1397     1470   +73     
  Misses          1        1           
Files with missing lines Coverage Δ
tests/test_morphsqueeze.py 100.00% <100.00%> (ø)

@ycexiao
Collaborator Author

ycexiao commented Sep 17, 2025

Tests for the expected squeeze morph outcome with non-strictly-increasing x_squeezed are left. Since this needs more experience in identifying what morphed outcome is "good enough", @Sparks29032, can you maybe code this part in another PR?

@ycexiao ycexiao marked this pull request as ready for review September 17, 2025 21:17
@ycexiao
Collaborator Author

ycexiao commented Sep 17, 2025

@sbillinge @Sparks29032, it's ready for review.

Comment on lines 81 to 85
"Squeezed grid is not strictly increasing."
"Please (1) decrease the order of your polynomial and "
"(2) ensure that the initial polynomial morph result in "
"good agreement between your reference and "
"objective functions."
Collaborator

See the discussion in #248 for what we want to change this error message to.

@Sparks29032
Collaborator

Tests for the expected squeeze morph outcome with non-strictly-increasing x_squeezed are left. Since this needs more experience in identifying what morphed outcome is "good enough", @Sparks29032, can you maybe code this part in another PR?

@ycexiao

I would rather not merge this if it is not well-tested. At least do the first test case in this PR. This one doesn't require "more experience".

Take a current squeeze morph test (e.g. the one Luis wrote with the sine function and small morphs on the order of 1e-3). Then, run the same squeeze morph with initial conditions say 0.1, 0.1, 0.1. When the --check-increase function is enabled, it should fail. Take off the --check-increase function and see if we converge to the same value.

@Sparks29032
Collaborator

@ycexiao Can you send screenshots of the outputs?

Collaborator Author

@ycexiao ycexiao left a comment

@Sparks29032, it's ready for review.

if list(x) != list(x_sorted):
    if self.check_increase:
        raise ValueError(
            "Error: The polynomial applied by the squeeze morph has "
Collaborator Author

The error message is updated according to #248

@ycexiao
Collaborator Author

ycexiao commented Sep 17, 2025

without --check-increase:
[image: screenshot of the output]
morph_x in the file ranges from 0 to 10, so in [-16, 0] the x and squeezed x overlap.

with --check-increase:
[image: screenshot of the output]

Comment on lines 114 to 136
def _get_overlapping_regions(self, x):
    diffx = numpy.diff(x)
    monotomic_regions = []
    monotomic_signs = [numpy.sign(diffx[0])]
    current_region = [x[0], x[1]]
    for i in range(1, len(diffx)):
        if numpy.sign(diffx[i]) == monotomic_signs[-1]:
            current_region.append(x[i + 1])
        else:
            monotomic_regions.append(current_region)
            monotomic_signs.append(numpy.sign(diffx[i]))
            current_region = [x[i + 1]]
    monotomic_regions.append(current_region)
    overlapping_regions_sign = -1 if x[0] < x[-1] else 1
    overlapping_regions_x = [
        monotomic_regions[i]
        for i in range(len(monotomic_regions))
        if monotomic_signs[i] == overlapping_regions_sign
    ]
    overlapping_regions = [
        (min(region), max(region)) for region in overlapping_regions_x
    ]
    return overlapping_regions
Collaborator

This is not efficient. Firstly, we do not want to obtain every value in the region of overlap, just the boundaries. Secondly, we should avoid using for loops when dealing with numpy data structures as much as we can. The numpy in-built functions are run in C (much faster).

See my comment in #248 for how we should implement this overlapping region search.

Below is an implementation w/o using for loops. Try this one out.

# Given a grid x
diffx = np.diff(x)
signx = np.sign(diffx)

# Create array that is 1 when signx is -1 and 0 otherwise. Padding added.
signx_negative = np.concatenate(([0], (signx==-1).astype(int), [0]))

# Left sides of our intervals
indices_left = np.where(np.diff(signx_negative)==-1)[0]

# Right sides of our intervals
indices_right = np.where(np.diff(signx_negative)==1)[0]

# Get x grid values corresponding to the interval indices
x_vals_left = x[indices_left]
x_vals_right = x[indices_right]

# Get the intervals
overlap_intervals = list(zip(x_vals_left, x_vals_right))
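
For anyone trying the snippet above, here is the same logic wrapped in a standalone function with a small input, as a sketch only. The function name `get_overlap_intervals` and the sample grid are made up for illustration. Each decreasing run of x starts at a local maximum and ends at a local minimum, so the "left" value of an interval is the minimum and the "right" value is the maximum.

```python
import numpy as np


def get_overlap_intervals(x):
    """Return (low, high) x values bounding each decreasing run of x."""
    x = np.asarray(x, dtype=float)
    signx = np.sign(np.diff(x))
    # 1 where the grid decreases, 0 otherwise; padded so runs at the ends close.
    signx_negative = np.concatenate(([0], (signx == -1).astype(int), [0]))
    edges = np.diff(signx_negative)
    # A decreasing run starts at a local maximum (edge +1) ...
    indices_right = np.where(edges == 1)[0]
    # ... and ends at a local minimum (edge -1).
    indices_left = np.where(edges == -1)[0]
    return list(zip(x[indices_left].tolist(), x[indices_right].tolist()))


# Two decreasing runs: 2.0 -> 1.0 and 3.0 -> 2.5.
print(get_overlap_intervals([0.0, 1.0, 2.0, 1.5, 1.0, 3.0, 2.5, 4.0]))
# prints [(1.0, 2.0), (2.5, 3.0)]
```

A strictly increasing grid yields an empty list, since no edges are found.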

@Sparks29032
Collaborator

@ycexiao See my in-line comment about implementing the overlap region search. Also, I would prefer if we had tests for your three helper functions.

@Sparks29032
Collaborator

without --check-increase: morph_x in the file ranges from 0 to 10. So in [-16, 0] the x and squeezed x are overlapped.

Try smaller numbers than 0,9,1. Like 0.1,0.1,0.1 to start. Quoting directly from my previous comment:

Take a current squeeze morph test (e.g. the one Luis wrote with the sine function and small morphs on the order of 1e-3). Then, run the same squeeze morph with initial conditions say 0.1, 0.1, 0.1. When the --check-increase function is enabled, it should fail. Take off the --check-increase function and see if we converge to the same value.

@ycexiao
Collaborator Author

ycexiao commented Sep 18, 2025

When the --check-increase function is enabled, it should fail. Take off the --check-increase function and see if we converge to the same value.

The on/off behavior is already covered by the existing tests. When --check-increase is provided, an error is thrown. When it is omitted (the default), none of the existing tests fail.

The tests I left out are those that check whether the numerical change (for example, the change in derivatives) introduced by sorting the non-strictly-increasing x is acceptable. We had no non-strictly-increasing test case before, since that situation would simply raise an error, so I don't have a "ground truth" to compare the current outcome with. But all the existing tests pass, so maybe that means the change is acceptable because the program is able to refine the results using the sorted x_squeeze?

@Sparks29032
Collaborator

@ycexiao This is not tested by existing tests. Below is what we want:

Run squeeze with 0,0,0. Converges to 0.01,0.01,0.01.

Run squeeze with 0.1,0.1,0.1. Gives error.

Turn off check increasing. Run with 0.1,0.1,0.1. Converges to 0.01,0.01,0.01.

@ycexiao
Collaborator Author

ycexiao commented Sep 30, 2025

@Sparks29032 @sbillinge, the current status of this PR:

This PR lacks convergence tests for non-strictly-increasing-x. I got stuck on designing the test case for it.

We need a carefully designed test case. The coefficients should be small enough that an all-zero initial guess converges to the non-strictly-increasing coefficients, and large enough to avoid extrapolation, since we don't have a convenient way to get the extrapolation boundaries from morph_arrays and the errors in the extrapolated part are huge.

The extrapolation information is only contained in the printed warning messages. If we can let morph_arrays return a Python object containing this information, we can implement this more easily. But I think it is kind of a big change to the API, and we can have more discussion about what we want that function to return, so this idea is not implemented in this PR.
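
To make the idea concrete, here is a rough sketch of what such an object could look like. Everything here is hypothetical: the `ExtrapolationInfo` class, the `find_extrapolation` function, and the field names are not part of the current API, and the sketch assumes extrapolation happens wherever the target grid falls outside the range of the squeezed grid.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class ExtrapolationInfo:
    """Hypothetical container describing where extrapolation occurred."""

    low_cutoff: float  # smallest squeezed x; target points below it are extrapolated
    high_cutoff: float  # largest squeezed x; target points above it are extrapolated
    n_extrapolated: int  # number of target grid points outside [low, high]


def find_extrapolation(x_target, x_squeezed):
    """Report which part of x_target lies outside the squeezed grid's range."""
    x_target = np.asarray(x_target, dtype=float)
    x_squeezed = np.asarray(x_squeezed, dtype=float)
    low, high = float(x_squeezed.min()), float(x_squeezed.max())
    outside = (x_target < low) | (x_target > high)
    return ExtrapolationInfo(low, high, int(outside.sum()))
```

A test could then restrict its comparison to the points between `low_cutoff` and `high_cutoff` instead of parsing the warning text.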

@sbillinge
Contributor

@Sparks29032 do we really need this functionality? Is there a UC where someone needs this to error? Maybe we just move this functionality to a (clean) issue on backburner and close this PR and #256?

Sometimes it is better to wait for users to request things before we implement them. Though if we already have a need for some reason we can push on with this.

@sbillinge
Contributor

@ycexiao @Sparks29032 can this be closed or backburnered (per the discussion)?

@ycexiao
Collaborator Author

ycexiao commented Oct 7, 2025

@sbillinge I think it can be closed. We can implement this feature when the API becomes more flexible or our users ask us to do so.

@sbillinge sbillinge added this to the backburner milestone Oct 7, 2025
@Sparks29032
Collaborator

@sbillinge @ycexiao Sorry I missed this thread. After thinking a bit more, I feel the default (safe) behavior should be to throw the error (with our custom error message, which is not yet implemented and is stuck in limbo in this PR). Then the user can request to ignore the increasing check with an option (with our warning). Since we haven't fleshed out the math for convergence / error bounds, this may be the safest approach?

@sbillinge
Contributor

@Sparks29032 I am not sure of the best approach. It really depends on how common this is and how harmful. When you make software harder to use it can frustrate people, so if someone is just trying to morph their data to look at something at the beamline and it keeps not working, that is bad. If we are saving them from something bad, that is fine, but if it is some trivial thing that just doesn't really matter, that is less good.

  1. Is this something that only happens once every 50 years, or is it happening like 30% of the time?
  2. If we shuffle the points and don't blow up the run, I still don't have a very good feeling about what is happening then, that seems a bit odd, is it modifying the data in some rather bad and irreversible way or is it just kind of aesthetics?

Of course morphing itself is modifying data, so it is kind of "photoshop" for data, and users are already venturing into dangerous territory.

Do we have thoughts about 1. and 2.?

@Sparks29032
Collaborator

  1. I haven't tested on enough data to know the behavior, but the shuffling (non-monotonicity) is likely to occur (>50%) when you increase beyond an order 3 polynomial (higher when your polynomial order is even since those are non-monotonic functions) and is extremely likely to occur when you have large deviations between your reference and objective grids. Following the instructions given by the error (which we can also include for the warning) mitigates the risk.
  2. The data is modified in a non-bijective way only if the function ends up mapping two grid points x_i,x_j to the same value, which is very unlikely (measure 0 chance for generic function).
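
To illustrate point 1, here is a small sketch that checks whether a squeezed grid stays strictly increasing. It assumes the squeezed grid has the form x + a0 + a1*x + a2*x**2 + ..., which is an assumption about the squeeze morph made for this example. The function name `squeezed_grid_is_increasing` and the grid from -10 to 10 are also chosen only for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial


def squeezed_grid_is_increasing(x, coeffs):
    """Check whether x + p(x) is strictly increasing on the grid x.

    coeffs are (a0, a1, a2, ...) for p(x) = a0 + a1*x + a2*x**2 + ...
    """
    x = np.asarray(x, dtype=float)
    x_squeezed = x + Polynomial(coeffs)(x)
    return bool(np.all(np.diff(x_squeezed) > 0))


x = np.linspace(-10.0, 10.0, 201)
# Small quadratic squeeze: derivative 1.001 + 0.002*x stays positive on the grid.
print(squeezed_grid_is_increasing(x, [0.001, 0.001, 0.001]))  # prints True
# Larger quadratic squeeze: derivative 1.1 + 0.2*x is negative for x < -5.5.
print(squeezed_grid_is_increasing(x, [0.1, 0.1, 0.1]))  # prints False
```

The quadratic term always has a turning point somewhere, so whether the grid stays monotonic depends on whether that turning point falls inside the range of x.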

@sbillinge
Contributor

To converge this convo, we need a plan/strategy for what we will do. The possibilities are:

  1. close this and fugedaboudit
  2. move to backburner and implement later if needed
  3. work to finalize and merge it.

What is the current situation if we do 1. or 2. Is the current code in a (a) precarious situation or (b) are we comfortable with it?

If (b) let's just backburner this (probably better than option 1.) If (a) we could either do option 3., or we could do some kind of lightweight fix that removes problems but doesn't fix everything on the PR. Let's call that option 4.

Which option do you'all vote for?

@Sparks29032
Collaborator

@sbillinge @ycexiao Since we warn the user of regions where the function is unstable post-morph, let's try option 3?
If it is so difficult to come up with such a test case, this is indicating that we don't have to worry too hard about those cases, since they are unlikely to occur anyway. Let's fix the merge conflicts and ship it? This PR is a strict monotonic increase of our current code's functionality, so I would rather have it than not.

@ycexiao
Collaborator Author

ycexiao commented Oct 12, 2025

@Sparks29032 @sbillinge, it's ready for review.
