Commit eb7b158

DOC clarify the kernel gradient for GaussianProcesses (scikit-learn#18115)
1 parent bdf2ff5 commit eb7b158

2 files changed: 52 additions and 42 deletions

doc/modules/gaussian_process.rst

Lines changed: 8 additions & 6 deletions
@@ -385,12 +385,14 @@ equivalent call to ``__call__``: ``np.diag(k(X, X)) == k.diag(X)``
 Kernels are parameterized by a vector :math:`\theta` of hyperparameters. These
 hyperparameters can for instance control length-scales or periodicity of a
 kernel (see below). All kernels support computing analytic gradients
-of the kernel's auto-covariance with respect to :math:`\theta` via setting
-``eval_gradient=True`` in the ``__call__`` method. This gradient is used by the
-Gaussian process (both regressor and classifier) in computing the gradient
-of the log-marginal-likelihood, which in turn is used to determine the
-value of :math:`\theta`, which maximizes the log-marginal-likelihood, via
-gradient ascent. For each hyperparameter, the initial value and the
+of the kernel's auto-covariance with respect to :math:`log(\theta)` via setting
+``eval_gradient=True`` in the ``__call__`` method.
+That is, a ``(len(X), len(X), len(theta))`` array is returned where the entry
+``[i, j, l]`` contains :math:`\frac{\partial k_\theta(x_i, x_j)}{\partial log(\theta_l)}`.
+This gradient is used by the Gaussian process (both regressor and classifier)
+in computing the gradient of the log-marginal-likelihood, which in turn is used
+to determine the value of :math:`\theta`, which maximizes the log-marginal-likelihood,
+via gradient ascent. For each hyperparameter, the initial value and the
 bounds need to be specified when creating an instance of the kernel. The
 current value of :math:`\theta` can be get and set via the property
 ``theta`` of the kernel object. Moreover, the bounds of the hyperparameters can be
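
One way to check the clarified statement is to compare the analytic gradient with a finite difference taken in log space. The sketch below is not part of the commit. It assumes an `RBF` kernel with an arbitrary length scale and random inputs, and it relies on `kernel.theta` holding the log-transformed hyperparameters.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF

# Arbitrary illustrative data and kernel (assumptions, not from the commit).
rng = np.random.RandomState(0)
X = rng.normal(size=(5, 2))
kernel = RBF(length_scale=1.5)

# Analytic gradient, returned alongside the kernel matrix.
K, K_gradient = kernel(X, eval_gradient=True)

# kernel.theta stores log(hyperparameters), so perturbing theta
# is a perturbation in log space.
theta = kernel.theta
eps = 1e-6
numeric = np.empty_like(K_gradient)
for l in range(theta.shape[0]):
    step = np.zeros_like(theta)
    step[l] = eps
    K_plus = kernel.clone_with_theta(theta + step)(X)
    K_minus = kernel.clone_with_theta(theta - step)(X)
    # Central difference approximating dK / d log(theta_l).
    numeric[:, :, l] = (K_plus - K_minus) / (2 * eps)

shape_ok = K_gradient.shape == (len(X), len(X), len(theta))
matches_log_gradient = np.allclose(K_gradient, numeric, atol=1e-6)
print(shape_ok, matches_log_gradient)
```

By the chain rule, the derivative with respect to the untransformed hyperparameter can be recovered as `K_gradient[:, :, l] / np.exp(theta[l])`.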

sklearn/gaussian_process/kernels.py

Lines changed: 44 additions & 36 deletions
@@ -572,8 +572,8 @@ def __call__(self, X, Y=None, eval_gradient=False):
 is evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined.
+Determines whether the gradient with respect to the log of the
+kernel hyperparameter is computed.

 Returns
 -------
@@ -582,7 +582,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape \
 (n_samples_X, n_samples_X, n_dims, n_kernels), optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
@@ -796,8 +796,8 @@ def __call__(self, X, Y=None, eval_gradient=False):
 is evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.

 Returns
 -------
@@ -806,7 +806,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
@@ -894,8 +894,8 @@ def __call__(self, X, Y=None, eval_gradient=False):
 is evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.

 Returns
 -------
@@ -904,7 +904,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
@@ -1072,8 +1072,8 @@ def __call__(self, X, Y=None, eval_gradient=False):
 is evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.

 Returns
 -------
@@ -1082,7 +1082,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
@@ -1200,8 +1200,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
 is evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined. Only supported when Y is None.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.
+Only supported when Y is None.

 Returns
 -------
@@ -1210,7 +1211,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when eval_gradient
 is True.
 """
@@ -1319,8 +1320,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
 is evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined. Only supported when Y is None.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.
+Only supported when Y is None.

 Returns
 -------
@@ -1329,7 +1331,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when eval_gradient
 is True.
 """
@@ -1466,8 +1468,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
 if evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined. Only supported when Y is None.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.
+Only supported when Y is None.

 Returns
 -------
@@ -1476,7 +1479,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
@@ -1620,8 +1623,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
 if evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined. Only supported when Y is None.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.
+Only supported when Y is None.

 Returns
 -------
@@ -1630,7 +1634,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
@@ -1809,16 +1813,17 @@ def __call__(self, X, Y=None, eval_gradient=False):
 if evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined. Only supported when Y is None.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.
+Only supported when Y is None.

 Returns
 -------
 K : ndarray of shape (n_samples_X, n_samples_Y)
 Kernel k(X, Y)

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims)
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when eval_gradient
 is True.
 """
@@ -1954,8 +1959,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
 if evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined. Only supported when Y is None.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.
+Only supported when Y is None.

 Returns
 -------
@@ -1964,7 +1970,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims), \
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
@@ -2086,8 +2092,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
 if evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined. Only supported when Y is None.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.
+Only supported when Y is None.

 Returns
 -------
@@ -2096,7 +2103,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
@@ -2240,8 +2247,9 @@ def __call__(self, X, Y=None, eval_gradient=False):
 if evaluated instead.

 eval_gradient : bool, default=False
-Determines whether the gradient with respect to the kernel
-hyperparameter is determined. Only supported when Y is None.
+Determines whether the gradient with respect to the log of
+the kernel hyperparameter is computed.
+Only supported when Y is None.

 Returns
 -------
@@ -2250,7 +2258,7 @@ def __call__(self, X, Y=None, eval_gradient=False):

 K_gradient : ndarray of shape (n_samples_X, n_samples_X, n_dims),\
 optional
-The gradient of the kernel k(X, X) with respect to the
+The gradient of the kernel k(X, X) with respect to the log of the
 hyperparameter of the kernel. Only returned when `eval_gradient`
 is True.
 """
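
Several of the docstrings above state that the gradient is only supported when Y is None. The following sketch, which is not part of the commit, shows what that looks like from the caller's side. The kernels and inputs are arbitrary choices, and the `ValueError` is the behaviour of the `RBF` implementation as commonly observed rather than something stated in this diff.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

# Arbitrary illustrative inputs (assumptions, not from the commit).
X = np.array([[0.0], [1.0], [2.5]])
Y = np.array([[0.5], [1.5]])

# A product kernel with two free hyperparameters:
# the constant value and the RBF length scale.
product = ConstantKernel(constant_value=2.0) * RBF(length_scale=1.0)

# With Y omitted, the gradient has one slice per free hyperparameter,
# each taken with respect to the log of that hyperparameter.
K, K_gradient = product(X, eval_gradient=True)
gradient_shape = K_gradient.shape

# The cross-covariance k(X, Y) can still be evaluated without a gradient.
rbf = RBF(length_scale=1.0)
K_cross = rbf(X, Y)

# Requesting the gradient together with Y is expected to be rejected.
try:
    rbf(X, Y, eval_gradient=True)
    raised = False
except ValueError:
    raised = True

print(gradient_shape, K_cross.shape, raised)
```

The last axis of `K_gradient` follows the order of `product.theta`, so it has length `len(product.theta)`.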
