Kernel Derivatives #46
Sounds great! How can I help?
Hello! Very early on there was an attempt at adding derivatives. I've since reworked much of the package and explored how other libraries approach derivatives. I'm almost done with the changes I've outlined in the "Optimization" section. Unfortunately, I need to finish that first since the derivatives have a few dependencies on those changes. Once that is complete, it will just be a matter of defining analytic derivatives for the parameters and a kernel/kernel-matrix derivative. I can provide some more direction as soon as that is done if you'd like to help. It will be a couple more days, though.
Excellent! I would like to help with defining the analytical derivatives. It seems that some of them have already been done. Should #2 be closed?
The optimization section is basically complete save for a few tests, so it's good enough to start on the derivatives. I've updated the original comment with some detail. I've also expanded the documentation here: http://mlkernels.readthedocs.io/en/dev/interface.html. The Hyper Parameters section may be helpful. If you'd like to add some derivative definitions and open a PR, feel free.
There are two components to this enhancement.
Optimization
Define a `theta` and `eta` (inverse `theta`) function to transform parameters from an open bounded interval to a closed bounded interval (or to eliminate the bounds entirely) for use in optimization methods. This is similar to how link functions work in logistic regression: unconstrained optimization is used to set a parameter value in the interval (0,1) using the logit link function.

- `theta` - given an interval and a value, applies a transformation that eliminates finite open bounds
- `eta` - given an interval and a value, reverses the value back to the original parameter space
- `gettheta` - returns the theta-transformed variable when applied to a `HyperParameter`, and a vector of theta-transformed variables when used on a `Kernel`
- `settheta!` - used to update `HyperParameter`s or `Kernel`s given a vector of theta-transformed variables
- `checktheta` - used to check whether the provided vector (or scalar, if working with a `HyperParameter`) is a valid update
- `upperboundtheta` - returns the theta-transformed upper bound. For example, if a parameter is restricted to (0,1], the transformed upper bound will be log(1)
- `lowerboundtheta` - returns the theta-transformed lower bound. For example, if a parameter is restricted to (0,1], the transformed lower bound will be -Infinity
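As a concrete illustration of the transformation described above, here is a minimal Python sketch for an interval with an open finite lower bound, such as (0, 1] or (0, Inf). The package itself is Julia, and the function signatures and interval handling below are illustrative assumptions, not the library's actual API:

```python
import math

def theta(value, lower):
    # Map a value from (lower, upper] (or (lower, Inf)) onto the real line.
    # The log transform eliminates the open finite lower bound.
    return math.log(value - lower)

def eta(t, lower):
    # Inverse of theta: map a transformed value back to the parameter space.
    return lower + math.exp(t)

def upperboundtheta(lower, upper):
    # Theta-transformed upper bound; for the interval (0, 1] this is log(1) = 0.
    return math.log(upper - lower)

def lowerboundtheta():
    # The open lower bound is pushed to -Infinity by the log transform.
    return -math.inf
```

Round-tripping `eta(theta(x, lower), lower)` recovers `x`, and for a parameter restricted to (0, 1] the transformed bounds come out as log(1) = 0 and -Infinity, matching the `upperboundtheta` and `lowerboundtheta` examples above.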
Derivatives

Derivatives will be with respect to `theta` as described above.

- `gradeta` - derivative of the `eta` function. Using the chain rule, this is applied to `gradkappa` to get the derivative with respect to theta. Not exported.
- `gradkappa` - derivative of the scalar part of a `Kernel`. This must be defined for each kernel; it will be manual, so the derivative will be either analytical or a hand-coded numerical derivative. It will only be defined for parameters of the kernel. Not exported. Ex. `dkappa(k, Val{:alpha}, z)`
- `gradkernel` - derivative of `kernel`. The second argument will be the variable the derivative is with respect to; a value type with the field name as a parameter will be used. Ex. `dkernel(k, Val{:alpha}, x, y)`
- `gradkernelmatrix` - derivative matrix.