Mle #299
base: main
Conversation
…t is the best way of extracting the kernel
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB.
One issue with this implementation is how the kernel is calculated. Currently, the kernel is calculated between the output means from the GP. Could this be done more efficiently by extracting the kernel from the PyTorch model?
Overall, this PR is well-structured and demonstrates the MLE functionality well — nice job! I’ve left some suggestions and questions in the review.
In summary:
- Clarifying input shapes: Adding explicit shape checks and improving docstring descriptions to ensure consistency and avoid unexpected behavior.
- Ensuring consistency between torch and numpy operations: This includes maintaining a clear distinction between frameworks to avoid breaking the computational graph, especially when using automatic differentiation (see the short illustration below).
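On the second point, here is a tiny illustration (not from the PR) of why mixing frameworks matters: once a tensor is converted with `.detach().numpy()`, autograd can no longer trace computations back through it.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

# Staying in torch keeps the computational graph intact.
y = (x ** 2).sum()
y.backward()
print(x.grad)  # tensor([2., 4.])

# By contrast, anything computed from x.detach().numpy() lives outside autograd,
# so no gradient can flow back to x from that point onward.
x_np = x.detach().numpy()
```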
from utils import select_kernel


def negative_log_likelihood(Sigma, obs_mean, model_mean):
Mathematically, this looks good! Some potential improvements:
- The log probability for multivariate normal distributions is already implemented in both scipy and torch; using those might be preferable (see the sketch after this list).
- Right now, the function assumes a single mean and covariance. Should we extend it to handle a batch of means with shape `(n, d)` and covariances of shape `(n, d, d)`? Also, adding shape checks might help avoid downstream errors.
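As a minimal sketch of the first suggestion, keeping the PR's argument order (`Sigma`, `obs_mean`, `model_mean`) and the 1e-5 jitter used later in the diff, but otherwise making assumptions about shapes, a torch-based version could look like this:

```python
import torch
from torch.distributions import MultivariateNormal


def negative_log_likelihood_torch(Sigma, obs_mean, model_mean, jitter=1e-5):
    """NLL of obs_mean under N(model_mean, Sigma), via torch's built-in log_prob."""
    d = Sigma.shape[-1]
    # Small diagonal jitter for numerical stability, mirroring the 1e-5 noise term in the PR.
    Sigma = Sigma + jitter * torch.eye(d, dtype=Sigma.dtype)
    dist = MultivariateNormal(loc=model_mean, covariance_matrix=Sigma)
    return -dist.log_prob(obs_mean)  # a tensor, so the result stays differentiable
```

A nice side effect is that `MultivariateNormal` already handles batched means and covariances, which would cover the second bullet as well.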
float: Negative log-likelihood value.
"""
# Add numerical stability to Sigma
noise_term = 1e-5 * np.eye(Sigma.shape[0])
If we allow batched inputs, we can use `Sigma.shape[-1]` (the last dimension) here, which will always be `d` for an `(n, d, d)` array or a `(d, d)` array.
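A small illustration of that, using a hypothetical `add_jitter` helper; `np.eye(d)` broadcasts against the trailing `(d, d)` axes of a batched array:

```python
import numpy as np


def add_jitter(Sigma, eps=1e-5):
    """Add a small diagonal term to Sigma, for a (d, d) matrix or a batched (n, d, d) array."""
    d = Sigma.shape[-1]
    # np.eye(d) has shape (d, d) and broadcasts over any leading batch dimensions.
    return Sigma + eps * np.eye(d)
```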
# Negative log-likelihood
nll = 0.5 * (quad_term + log_det_Sigma + len(diff) * np.log(2 * np.pi))

return nll.item()
If we were using torch here, calling `item()` stops the computation graph. To preserve differentiability, we can return the (torch) tensor. If this is what we want, then the rest of the function needs to be converted to torch.
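For reference, a rough sketch of what an all-torch version of this computation might look like, keeping the names from the numpy version but assuming a single mean of shape `(d,)` and covariance of shape `(d, d)`:

```python
import math

import torch


def nll_torch(Sigma, obs_mean, model_mean, jitter=1e-5):
    """Explicit multivariate-normal NLL in torch, keeping the computation graph intact."""
    d = Sigma.shape[-1]
    Sigma = Sigma + jitter * torch.eye(d, dtype=Sigma.dtype)
    diff = obs_mean - model_mean
    # Solve Sigma @ x = diff rather than forming the inverse explicitly.
    quad_term = diff @ torch.linalg.solve(Sigma, diff)
    log_det_Sigma = torch.logdet(Sigma)
    nll = 0.5 * (quad_term + log_det_Sigma + d * math.log(2 * math.pi))
    return nll  # no .item(): the caller can still backpropagate through the result
```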
""" | ||
Maximize the likelihood by optimizing the model parameters to fit the observed data. | ||
|
||
Args: |
Need to update the docstrings.
expectations (tuple): A tuple containing two elements:
- pred_mean (Tensor): The predicted mean values (could be 1D or 2D tensor).
- pred_var (Tensor): The predicted variance values (could be 1D or 2D tensor).
obs (list or tuple): A list or tuple containing:
Should we split this, so that the required arguments become `obs_means` of shape `(n, d)` or `(d,)`, and `obs_covs` of shape `(n, d, d)` or `(d, d)`, where both are torch tensors? If so, then we'd also need to check that both are either given as a batch with an `n`-sized leading axis, or as a single example.
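A rough sketch of that consistency check, using the `obs_means`/`obs_covs` names from this comment (everything else is an assumption):

```python
import torch


def validate_obs(obs_means: torch.Tensor, obs_covs: torch.Tensor) -> bool:
    """Return True if the inputs are batched ((n, d) and (n, d, d)), False if single ((d,) and (d, d))."""
    if obs_means.ndim == 1 and obs_covs.ndim == 2:
        batched = False
    elif obs_means.ndim == 2 and obs_covs.ndim == 3:
        batched = True
        if obs_means.shape[0] != obs_covs.shape[0]:
            raise ValueError("obs_means and obs_covs must have the same batch size n")
    else:
        raise ValueError("obs_means and obs_covs must both be single examples or both be batched")
    if obs_covs.shape[-1] != obs_covs.shape[-2] or obs_means.shape[-1] != obs_covs.shape[-1]:
        raise ValueError("each covariance must be a square (d, d) matrix matching the mean dimension d")
    return batched
```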
mean = model.predict(optimizable_params.detach().numpy())
K = kernel(mean.reshape(-1, 1))  # X is the input data
nll = torch.tensor(
negative_log_likelihood(K, obs_mean, mean),
What is `obs_mean` supposed to be here? A batch of means (with shape `(n, d)`) or a single mean (with shape `(d,)`)?
return nll.item()


def max_likelihood(parameters, model, obs, lr=0.01, epochs=1000, kernel_name="RBF"):
Just to clarify my understanding, is the following correct? (A rough code sketch of this flow is included after the list.)

Inputs and Outputs:
- `parameters` are the candidate inputs to the simulation model.
- The simulation outputs are summarized by their mean (`obs_mean`) and variance (`obs_var`). For a deterministic simulation, you’d typically have only `obs_mean`.

Emulator Predictions:
- The emulator (or surrogate model) takes the `parameters` as inputs and returns a predicted mean (`pred_mean`).
- Instead of relying on a separately predicted variance (`pred_var`), the kernel is used within the negative log-likelihood (NLL) computation to generate the corresponding covariance.

NLL Computation and Optimization:
- The NLL is computed for each candidate by comparing the observed outputs (`obs_mean`) against the predictive density defined by `pred_mean` and the kernel-derived covariance.
- The candidate input with the lowest NLL (i.e., the one that is most plausible under the emulator’s predictive distribution) is selected.
- That selected candidate is then further refined using gradient descent to minimize the NLL, effectively fine-tuning the input so that the emulator's output distribution aligns closely with the observation.
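If that reading is right, a minimal sketch of the outer loop might look like the following; `select_and_refine` and `nll_of` are hypothetical names, and `nll_of` is assumed to be an end-to-end torch-differentiable map from a candidate input to its NLL:

```python
import torch


def select_and_refine(candidates, nll_of, lr=0.01, epochs=1000):
    """Pick the candidate input with the lowest NLL, then refine it by gradient descent.

    `candidates` is a list of torch tensors; `nll_of(x)` returns a scalar NLL tensor
    and is assumed to be differentiable with respect to x.
    """
    # 1. Score every candidate under the emulator's predictive density.
    with torch.no_grad():
        scores = torch.stack([nll_of(x) for x in candidates])

    # 2. Select the most plausible candidate (lowest NLL).
    best = candidates[int(scores.argmin())].clone().requires_grad_(True)

    # 3. Fine-tune that candidate by minimizing the NLL directly.
    optimizer = torch.optim.Adam([best], lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = nll_of(best)
        loss.backward()
        optimizer.step()
    return best.detach()
```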
result = max_likelihood(parameters=X, model=best_model, obs=[0, 10])

print(f"Indices of plausible regions: {result['optimized_params']}")
Is "indicies" correct? Shouldn't it be a point in the input space of the simulation function?
gp_final = em.refit(gp)

seed = 42
It looks like the seed is set multiple times. What do you think about collecting these global operations into one section?
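Something along these lines, perhaps, with a hypothetical `set_seed` helper (which libraries actually need seeding depends on what the notebook uses):

```python
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Seed every random number generator used in the notebook in one place."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


# Called once, at the top of the script/notebook.
set_seed(42)
```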
@@ -0,0 +1,127 @@
import matplotlib.pyplot as plt
Nice job on demonstrating the usage. For clarity and modularity, what do you think about encapsulating each demonstration into its own function, e.g. `run_epidemic_experiment()`?
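For example, something like the following, where `run_epidemic_experiment` is the name floated above and the body is only a placeholder for the existing demo code:

```python
import matplotlib.pyplot as plt


def run_epidemic_experiment(seed: int = 42):
    """Run one self-contained demonstration: set the seed, fit the emulator, run MLE, and plot."""
    # set_seed(seed)  # e.g. the hypothetical seeding helper sketched in the earlier comment
    # ... existing demo code: simulate data, fit/refit the emulator, call max_likelihood(...) ...
    fig, ax = plt.subplots()
    ax.set_title("Epidemic experiment (placeholder)")
    return fig


if __name__ == "__main__":
    run_epidemic_experiment()
```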
In this PR I have so far implemented the max_likelihood function; the outer loop in this figure shows how we hope the output of this function can be used for active learning in AutoEmulate.