
[Feat] Meas class for Gaussian error propagation. #245

Open · wants to merge 3 commits into develop
Conversation

@fjosw (Owner) commented Nov 17, 2024

I sometimes want to do Gaussian error propagation and find the syntax for creating a scalar cov_Obs a bit inconvenient. For this reason I propose a Meas class that simplifies this.

import pyerrors as pe
value = 1.1
standard_error = 0.23
meas = pe.Meas(value, standard_error)
covo = pe.cov_Obs(value, standard_error ** 2, meas.names[0])
assert covo == meas

Standard errors do not need to be squared, and if no name is provided, a uuid is generated instead. I also propose to automatically evaluate the gamma method when an Obs exclusively consists of cov_Obs / Meas, as the error estimation should be unambiguous and the performance overhead should be minimal.
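For readers less familiar with what such a class does under the hood: it is standard Gaussian error propagation. A minimal stdlib sketch (not the pyerrors implementation) for statistically independent inputs:

```python
import math

def propagate(grads, sigmas):
    """Gaussian error propagation for independent inputs:
    sigma_f = sqrt(sum_i (df/dx_i * sigma_i)**2)."""
    return math.sqrt(sum((g * s) ** 2 for g, s in zip(grads, sigmas)))

# f = x * y with x = 1.1 +/- 0.23 and y = 2.0 +/- 0.1
x, sx = 1.1, 0.23
y, sy = 2.0, 0.1
sigma_f = propagate([y, x], [sx, sy])  # df/dx = y, df/dy = x
```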

In the process I also found a small bug in the cov_Obs implementation.

@fjosw fjosw requested a review from s-kuberski November 17, 2024 11:28
raise TypeError(f"dvalue has to be a float or int, not {type(dvalue)}")
super().__init__(samples=[], names=[], means=[])
if name is None:
name = uuid.uuid4().hex
A collaborator commented:

Hey @fjosw,
the class looks good to me. Thanks for the idea and addition. I just wanted to ask if this is something that we want. I do not mind it, and if there is a specific use case for you, we can, of course, leave it in.

@s-kuberski (Collaborator)

Hi,

thanks for the suggestion! What was the bug in the existing implementation? The missing 'float()'?

In general, I think it is handy to have a wrapper that makes one's life easier. However, I ask myself whether assigning automatically chosen random names to observables is something one should do. I know that we already do it for priors in fits, and there it would be very hard to circumvent. Here, there is the possibility that people introduce their external observables with the new class and destroy their error propagation because they add the corresponding uncertainty multiple times. What exactly is the use case you were aiming at?
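The hazard described here can be made concrete with plain Gaussian algebra (a stdlib sketch, independent of pyerrors): if the same external constant enters an analysis twice under two different auto-generated names, its contributions combine in quadrature as if independent, underestimating the uncertainty compared to the correlated case.

```python
import math

sigma = 0.23  # standard error of one external constant

# Same name (fully correlated): in the sum x + x the error doubles.
correlated = 2 * sigma

# Two different auto-generated names (treated as independent):
independent = math.sqrt(sigma ** 2 + sigma ** 2)

# independent = sqrt(2) * sigma < 2 * sigma, so the error is underestimated.
```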

It is certainly a very good idea to perform the error propagation automatically if no MC data is present!

@fjosw (Owner, Author) commented Nov 18, 2024

Hi @jkuhl-uni & @s-kuberski,

What was the bug in the existing implementation? The missing 'float()'?

Yes, we currently compute the gradients for __pow__ via autograd, which cannot cope with int, so I had to add the float cast to make it work. We could actually provide the gradients for pow directly, which should be faster. I can prepare a PR for that.
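For reference, the analytic gradients of f(x, y) = x ** y that could replace the autograd call are straightforward (a sketch, not the pyerrors code):

```python
import math

def pow_grads(x, y):
    """Analytic gradients of f(x, y) = x ** y for x > 0:
    df/dx = y * x**(y - 1), df/dy = x**y * log(x)."""
    return y * x ** (y - 1), x ** y * math.log(x)

# Example: d/dx x**3 at x = 2 is 12; d/dy 2**y at y = 3 is 8 * log(2).
gx, gy = pow_grads(2.0, 3.0)
```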

if this is something that we want
I ask myself if assigning automatically chosen random names to observables is something that one should do.

My underlying assumption for the random uuid is that, unlike Monte Carlo data, constants with errors are usually statistically independent of all other sources of error. If that is the case it might be easier to automatically assign a random string instead of having the user come up with unique identifiers. The option to assign a dedicated name of course still exists.

By adding the Meas class we could slightly increase the scope of pyerrors as a tool for Gaussian error propagation which feels a bit cumbersome with the current UI. I was considering adding another example to detail this feature but if you feel like it is too dangerous to blindly assign a random name, I'm happy to remove it or change the default.

@s-kuberski (Collaborator)

We could actually provide the gradients for pow directly which should be faster. I can prepare a PR for that.

That sounds good!

By adding the Meas class we could slightly increase the scope of pyerrors as a tool for Gaussian error propagation which feels a bit cumbersome with the current UI.

I agree that the current UI is not perfectly suited for quick error propagation, because I chose to ask for the covariance and because we require a name. If we would like to let users perform tasks similar to those in the 'uncertainties' package just as easily, the proposed change would be the perfect solution, I think.

My underlying assumption for the random uuid is that, unlike Monte Carlo data, constants with errors are usually statistically independent of all other sources of error. If that is the case it might be easier to automatically assign a random string instead of having the user come up with unique identifiers. The option to assign a dedicated name of course still exists.

As soon as someone runs a script twice and saves the output somewhere to process it afterwards, I would think that a dedicated naming scheme is very important to ensure proper error estimation. If someone is not aware of this, it could easily happen that they input, say, the physical pion mass using the new class and end up with uncorrelated contributions from the same observable. I'd think that names really matter as soon as one does a proper analysis.

So it probably boils down to anticipating what users will do with the package, which would still provide all the means to do everything properly, and to the question whether not asking for a name by default makes it easier to make mistakes. I'm not really sure how to proceed.

@fjosw (Owner, Author) commented Nov 20, 2024

As soon as someone runs a script twice and saves the output somewhere to process it afterwards, I would think that a dedicated naming scheme is very important to ensure proper error estimation.

If your main concern is about saving and reusing objects with automatically generated names, we could simply prevent such Obs from being exported. I added an explicit check to the json module which throws a ValueError in these cases. Do you feel this is safe enough, or are there still other silent failure modes?
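Such a guard might look like the following sketch (the helper name and signature are hypothetical, not the actual pyerrors json code):

```python
def assert_exportable(names, auto_named):
    """Refuse to serialize observables whose identifiers were
    auto-generated, since reloading them in a second run would
    silently decorrelate contributions from the same constant."""
    bad = [n for n in names if n in auto_named]
    if bad:
        raise ValueError(
            f"Cannot export Obs with auto-generated names: {bad}"
        )
```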

@s-kuberski (Collaborator)

I'm still not fully convinced, but I don't want to block it if you'd like to have the feature in. I still fear that people will create the same object twice with different uuids in their analysis when they are not as well organized as you ;). Basically, I think it is somewhat bad style if the final object that comes out of an analysis contains a contribution that is not clearly identifiable. It goes a bit against one of the basic ideas... So I would think that the feature is only good for some back-of-the-envelope error propagation, not for real-world analyses.

In case we add the class, I think it would be good to expand the docstring, clearly stating the restrictions of the class and pointing to the covobs class for "proper" error propagation. Also, I was thinking about the name: why did you choose "scalar measurement"? Somehow, I would relate a measurement to something that comes out of an MC chain. I don't have a good suggestion for an alternative, though.

@s-kuberski (Collaborator)

Maybe we could make the name non-optional and still use a uuid if None or an empty string is passed? This would at least prompt users to think about names. Of course, it again adds a bit more baggage to the easy creation of these objects. At least one would not have to come up with names if one just wants to do a bit of error propagation where everything can be assumed to be uncorrelated.

@jkuhl-uni (Collaborator)

I also thought about runtime warnings like

Warning: No name given, using UUID.

However, I think those would be pretty ugly and not helpful, since a) a warning would still be issued even when the class is used as intended, and b) they could get lost among too many other notifications.
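A sketch of what such a warning would look like (a hypothetical helper, and arguably too noisy for the reasons given above):

```python
import uuid
import warnings

def default_name(name=None):
    # Fall back to an auto-generated uuid and warn about it.
    if not name:
        warnings.warn("No name given, using a uuid instead.", UserWarning)
        return uuid.uuid4().hex
    return name
```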
