Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/corr matrix and inverse cov matrix as input in least squares function for correlated fits #223

Merged

Conversation

PiaLJP
Copy link
Contributor

@PiaLJP PiaLJP commented Jan 10, 2024

Added kwargs corr_matrix and inv_chol_cov_marix to the least_squares function.
Note: For the kwargs to work correlated_fit=True needs to be set too.
(1) kwarg corr_matrix
Now it is possible to hand over a correlation matrix of your own choosing for a correlated fit.
In the least_squares function a Cholesky decomposition is then applied to the correlation matrix
and the resulting lower triangular matrix is then combined with the y errors and inverted such that
a lower triangular matrix corresponding to the inverse covariance matrix is extracted (and later used
for the definition of the correlated chi squared function).
(2) kwarg inv_chol_cov_marix
It is also possible to hand over the inverse covariance matrix directly (as a lower triangular matrix)
if it has been constructed from a Cholesky decomposition of the correlation matrix in the same way
as detailed in (1).

@PiaLJP PiaLJP requested a review from fjosw as a code owner January 10, 2024 17:52
@fjosw fjosw requested a review from s-kuberski January 11, 2024 07:55
@s-kuberski
Copy link
Collaborator

Hi Pia,
thanks a lot for adding this feature, I think it will be quite useful. I'll take some time to test your implementation before I provide further feedback.

Copy link
Collaborator

@s-kuberski s-kuberski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
I have written a lot of text, not because I don't like your code but because I think we need a little bit of discussion concerning the capabilities of the routine with respect to the new parameters. Thanks again for starting this!

pyerrors/fits.py Outdated Show resolved Hide resolved
pyerrors/fits.py Outdated Show resolved Hide resolved
pyerrors/fits.py Outdated Show resolved Hide resolved
pyerrors/fits.py Outdated Show resolved Hide resolved
pyerrors/fits.py Outdated Show resolved Hide resolved
pyerrors/fits.py Outdated Show resolved Hide resolved
tests/fits_test.py Show resolved Hide resolved
@s-kuberski
Copy link
Collaborator

Hi,
thanks for your explanation. I started these distributed comments but I'll try to condense everything to a single message now...
Based on what I understand from our discussion, we would provide the user with two possibilities:

  • Passing a covariance matrix that is assumed to be a proper estimator of the true covariance matrix. This would allow the user to use a different definition than the one that is used inside the current routine or to use some sort of shrinkage algorithm which would also deliver a proper estimator.
  • Passing an inverted covariance matrix (in proper shape for the fit). This would allow to use modified weights which, together with the expected chisquare, could still be useful for fits.

I think that this is a good solution (I'll try to think more if I can come up with cases that are not covered).

I have a few proposals for this strategy:

  • The correlation matrix that is provided by the user should then also be used in the expected chisquare, because we assume that it is a good estimator for the true correlation matrix.
  • To enable the user to provide the covariance matrix in the proper form, we could think of putting everything in lines

    pyerrors/pyerrors/fits.py

    Lines 300 to 308 in df1873b

    corr = covariance(y_all, correlation=True, **kwargs)
    covdiag = np.diag(1 / np.asarray(dy_f))
    condn = np.linalg.cond(corr)
    if condn > 0.1 / np.finfo(float).eps:
    raise Exception(f"Cannot invert correlation matrix as its condition number exceeds machine precision ({condn:1.2e})")
    if condn > 1e13:
    warnings.warn("Correlation matrix may be ill-conditioned, condition number: {%1.2e}" % (condn), RuntimeWarning)
    chol = np.linalg.cholesky(corr)
    chol_inv = scipy.linalg.solve_triangular(chol, covdiag, lower=True)
    into a separate function, e.g., def invert_corr_cov_cholesky(corr, covdiag). This could then be called inside the fit routine but also from outside such that a user can first come up with a correlation matrix of their own choosing and then invert it before passing it to the fit routine. We would have to explain to the user how corr and covdiag are defined. In principle it could also be a possibility to have def invert_cov_cholesky(cov) where the weights are computed inside the routine. (Is it clear what I mean?)
  • I think all checks regarding positive definiteness and the condition number would still be in place if the user used the (newly provided) function to invert the covariance matrix of their choosing.
  • It would certainly be very useful to demonstrate all of the possibilities in the examples and to write very accurate doc strings.

@PiaLJP
Copy link
Contributor Author

PiaLJP commented Jan 19, 2024

Hi Simon,
thanks for the summary and suggestions! I basically agree with all your points I just have a few additional comments.

Passing a covariance matrix that is assumed to be a proper estimator of the true covariance matrix. This would allow the user to use a different definition than the one that is used inside the current routine or to use some sort of shrinkage algorithm which would also deliver a proper estimator.

Just to be sure you meant to say correlation matrix? So you are happy with kwargs options corr_matrix and inv_chol_cov_matrix as they are now?

I thinks it's a good idea to define def invert_corr_cov_cholesky(corr, covdiag) outside theleast_squaresfunction. But I have noticed that is not necessary to construct the inverse covariance matrix in exactly that way. The inverse covariance matrix (kwarg inv_chol_cov) would only have to be a lower triangular matrix from a Cholesky decomposition whether the decomposition is carried out only after the covariance matrix has been inverted does not matter. (The result can be slightly different of course if the condition number is not ideal so this might be a reason to favour the way the inverse covariance is calculated in the code at the moment since the 'size' (fewer non-zero elements) of the matrix is reduced early on...). So the kwarg is more flexible than I thought originally and a Cholesky decomposition always exists (if the matrix is symmetric and positive definite which it should be anyway). Then def invert_corr_cov_cholesky(corr, covdiag) would not have to be used of course so

I think all checks regarding positive definiteness and the condition number would still be in place if the user used the (newly provided) function to invert the covariance matrix of their choosing.

would not hold in the way you say. The Cholesky decomposition only works if positive definiteness holds but the condition number could still be large. I would still like to allow this so that if you are provided with an inverse covariance matrix (not unusual for cross-checking results) from someone else for instance you can still use it. We could just add in the description that the condition number should not be too large as you can also see from the function
def invert_corr_cov_cholesky(corr, covdiag)

In principle it could also be a possibility to have def invert_cov_cholesky(cov) where the weights are computed inside the routine. (Is it clear what I mean?)

Here you're referring to modified weights for single data points? So you would like to be able to hand over a list/dict of weights to this function?

The correlation matrix that is provided by the user should then also be used in the expected chisquare, because we assume that it is a good estimator for the true correlation matrix.

Ok so this would be the same kwarg corr_matrix or the kwarg inv_chol_cov_matrix but the expected chisquare would of course only be calculated if the the kwarg expected_chisquare is set in addition (so in an uncorrelated fit). Is that what you had in mind?

@s-kuberski
Copy link
Collaborator

s-kuberski commented Jan 26, 2024

Hi,
sorry for the delay...

Just to be sure you meant to say correlation matrix? So you are happy with kwargs options corr_matrix and inv_chol_cov_matrix as they are now?
Yes, sorry, that should have been "correlation matrix".

I didn't fully get your next paragraph. Do you mean that one could also provide an inverted correlation matrix and then, in the fit routine, do the cholesky decomposition? I think the two steps do not necessarily commute but, as you write, this is probably more of an issue of the condition number is bad.

If we would provide the routine to invert and decompose a matrix, which then can be passed to the fit routine, we would remain fully flexible, right? People could come up with a matrix of their choosing and use pyerrors to Cholesky-decompose and invert, before passing it to the fit routine. In this case, all of the checks would still be in place (you could allow to explicitly turn errors into warnings, if you'd like). The other case would be to use an inverse matrix from somewhere else (e.g., to crosscheck someone other's work). In that case (I would think that it is an unusual one), the user would have to do the Cholesky decomposition themselves, before passing the matrix to the code.

Here you're referring to modified weights for single data points? So you would like to be able to hand over a list/dict of weights to this function?

I was wondering how to allow for modified weights for single data point, exactly. This would not go into the correlation matrix but into the "error vector". I am still a bit confused by all the possibilities... If we do not think about the expected chi^2, it would be fine to just be able to pass a correlation matrix, the "error vector" (something related to covdiag in the current implementation) or an inverted correlation matrix. As soon as one takes into account the expected chi^2, it matters if these matrices and weights are considered to be "better estimators for the true covariance matrix" or "modified covariance matrices" which can be turned to be useful with the expected chi^2.

@fjosw
Copy link
Owner

fjosw commented May 24, 2024

Hey @PiaLJP & @s-kuberski, any updates on this PR? Do you still want to proceed of should we close it for now?

@s-kuberski
Copy link
Collaborator

Hi,
we discussed a bit offline to be more efficient. As I understood, @PiaLJP will get back to implementing the changes that we discussed in the near future.

PiaLJP and others added 7 commits June 25, 2024 10:54
…utsourced the inversion & cholesky decomposition of the covariance matrix (function 'invert_corr_cov_cholesky(corr, covdiag)')
… it) to sort correlation matrix according to a list of alphabetically sorted keys
…d typos in documentation of invert_corr_cov_cholesky()
@s-kuberski
Copy link
Collaborator

Thanks a lot for the last changes. From my side, this can be merged. If you are fine with this @fjosw, I'll go ahead.

Copy link
Owner

@fjosw fjosw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me 👍

@fjosw fjosw merged commit 1d6f7f6 into fjosw:develop Sep 13, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants