-
Notifications
You must be signed in to change notification settings - Fork 37
ARD Implementation and Automatic Differentation compatibility #82
base: master
Are you sure you want to change the base?
Conversation
Ah I just realised some features are not working... I will correct them and make a new PR |
3 similar comments
Hi @theogf! I want to start by saying thanks for taking an interest in my project! I appreciate your enthusiasm to contribute; I see you’ve put a lot of effort into this pull request. I do want to support ARD & make AD possible. I had hoped to release a version 1.0 to correspond to Julia 1.0 with the linear/periodic kernels fully deprecated and #65 closed before looking into adding ARD as a feature (I’m working on some documentation). I’ve created issue #83 for ARD where I’d like to discuss the approach further. I’ve put a proposal in there which I think may end up a little cleaner. |
@check_args(ExponentialKernel, α, α > zero(T), "α > 0") | ||
return new{T}(α) | ||
struct ExponentialKernel{T<:Real,A} <: AbstractExponentialKernel{T} | ||
α::A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the A
parameter really needed or could Union{AbstractVector{T},T}
work instead? I found that having multiple parameters made things far more complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The A
parameter has a nice property to become either a scalar or vector, and the second big advantage is that it will allow compatibility with CuArrays
and other more complex arrays if other users want to use the package. But indeed one could restrict A
to A<:Union{AbstractVector{T},T}
ν::T | ||
ρ::T | ||
function MaternKernel{T}(ν::Real=T(1), ρ::Real=T(1)) where {T<:AbstractFloat} | ||
α::A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to keep the variable names consistent with the literature (but allow for vectors) - this was also one of the reasons for my proposal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think keeping consistency is important! But adding a vector just for the scaling could be costly. The proposal of weights make sense and actually go in the direction of the way Distances.jl
work. Is there actually a reason why you don't base your base_matrix
functions on it?
@@ -119,7 +122,7 @@ The ``\gamma``-exponential kernel is an isotropic Mercer kernel given by the for | |||
κ(x,y) = exp(α‖x-y‖²ᵞ) α > 0, γ ∈ (0,1] | |||
``` | |||
where `α` is a scaling parameter and `γ` is a shape parameter of the Euclidean distance. | |||
When `γ = 1` use [`SquaredExponentialKernel`](@ref) and [`SquaredExponentialKernel`](@ref) | |||
When `γ = 1` use [`SquaredExponentialKernel`](@ref) and [`ExponentialKernel`](@ref) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
Also added #84. Part of the strength of this package is that it usually uses BLAS, so I have to see what has to change to support that. |
Hello! I reworked a lot of things to make things compatible and realized that I was getting pretty far away from the original package (notably by relying on |
@willtebbutt @theogf So funny story there - I was going to swap out the back-end for Is this the new package? https://github.com/theogf/KernelFunctions.jl I've been wanting to participate more in the community rather than just working solo on my own package(s). Any chance I can help with your package's development? I'd like to continue as a contributor. I can also deprecate and link to yours. |
That's amazing! Of course we would be more than happy to have you on the team! I will add you as a collaborator. |
Hello,
I made relatively heavy changes to your package with the objective to be able to have Automatic Relevance Determination and to be compatible with Automatic Differentation packages such as ForwardDiff.jl.
I changed the way distances were computed by directly applying the scale factor (appropriately) to the vectors. It might make the algorithm slower, I have not made benchmarks so far.
However it is still retrocompatible and using a scalar scaling work with everything.
I did not add tests yet for the ARD case, and I had to remove some cause of the needed transformation of the parameters (in the "Testing constructors" part) cause I did not know if the best option is to keep a copy of the original parameters or ignore this issue. Let me know if you would like other modifications.
Cheers,