Question about optimising the prior variance using marginal likelihood #4

Could you clarify the role of `1 / t.linalg.norm(s2) * map_norms` in the model evidence formula? Specifically, how does the normalization by `t.linalg.norm(s2)` affect the contribution of `map_norms` to the overall model evidence?

When maximizing the marginal likelihood, should the model evidence term be negated, given that the optimization algorithm minimizes its objective? If not, could you clarify how the model evidence is handled in the optimization process?

```python
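# Excerpt under discussion: `t` is presumably `torch`; `model`, `LL`, `s2`
# and `logdet` are defined elsewhere in the surrounding code.
map_norms = 0.0
# Collect the trainable LoRA parameters.
lora_params = {
    k: v
    for k, v in dict(model.named_parameters()).items()
    if "lora" in k.lower() and v.requires_grad
}
# Sum the (unsquared) norms of the LoRA parameters.
for i, param in enumerate(lora_params.values()):
    map_norms += t.linalg.norm(param)
# The line the two questions above refer to.
model_evidence = LL + 1 / t.linalg.norm(s2) * map_norms + 0.5 * logdet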
```
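For comparison, here is a minimal sketch of the textbook Laplace approximation to the log evidence, which is what I would have expected the snippet to resemble. It rests on several assumptions that are not taken from the repository: an isotropic prior N(0, s2 * I) with scalar `s2`, a diagonal likelihood Hessian `h_lik`, and a MAP estimate `theta` held fixed while `s2` varies. All function names and toy numbers are hypothetical. Under those assumptions the prior term enters as `-||theta||^2 / (2 * s2)` with a squared norm, and an optimizer that minimizes would be given the negated evidence.

```python
import math


def laplace_log_evidence(ll, theta, h_lik, s2):
    """Laplace log evidence, up to constants that do not depend on s2.

    Assumes a prior N(0, s2 * I) and a diagonal likelihood Hessian h_lik.
    """
    sq_norm = sum(p * p for p in theta)
    # Prior density at the MAP estimate: -||theta||^2 / (2 * s2).
    prior_term = -0.5 * sq_norm / s2
    # -(P/2) * log(s2) - 0.5 * logdet(H_lik + I / s2) simplifies, for a
    # diagonal Hessian, to -0.5 * sum(log(1 + s2 * h_i)).
    logdet_term = -0.5 * sum(math.log1p(s2 * h) for h in h_lik)
    return ll + prior_term + logdet_term


def grad_wrt_log_s2(theta, h_lik, s2):
    """Derivative of the log evidence with respect to log(s2)."""
    sq_norm = sum(p * p for p in theta)
    d_ds2 = 0.5 * sq_norm / s2**2 - 0.5 * sum(h / (1.0 + s2 * h) for h in h_lik)
    return s2 * d_ds2  # chain rule: d/d(log s2) = s2 * d/d(s2)


def fit_prior_variance(theta, h_lik, s2_init=1.0, lr=0.05, steps=500):
    """Maximize the evidence by gradient descent on loss = -log_evidence."""
    log_s2 = math.log(s2_init)  # optimize log(s2) so that s2 stays positive
    for _ in range(steps):
        s2 = math.exp(log_s2)
        grad_loss = -grad_wrt_log_s2(theta, h_lik, s2)  # gradient of the negated evidence
        log_s2 -= lr * grad_loss
    return math.exp(log_s2)


# Toy values, chosen only for illustration.
theta = [0.5, -1.0, 0.25]
h_lik = [4.0, 2.0, 8.0]
s2_hat = fit_prior_variance(theta, h_lik)
print(s2_hat, laplace_log_evidence(-10.0, theta, h_lik, s2_hat))
```

The log likelihood `ll` does not depend on `s2` in this sketch, so it shifts the evidence without changing the fitted variance. If the repository uses a different parameterization of `s2` or a different sign convention for `logdet`, the comparison above would need adjusting.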
    
