
Commit 04aff3a

Define the new device parameter. (dmlc#9362)
1 parent 2d0cd28 commit 04aff3a


63 files changed: +825 -475 lines changed

CITATION (-1)

@@ -15,4 +15,3 @@
   address = {New York, NY, USA},
   keywords = {large-scale machine learning},
 }
-

doc/gpu/index.rst (+5 -4)

@@ -22,21 +22,22 @@ Supported parameters
 GPU accelerated prediction is enabled by default for the above mentioned ``tree_method`` parameters but can be switched to CPU prediction by setting ``predictor`` to ``cpu_predictor``. This could be useful if you want to conserve GPU memory. Likewise when using CPU algorithms, GPU accelerated prediction can be enabled by setting ``predictor`` to ``gpu_predictor``.

 The device ordinal (which GPU to use if you have many of them) can be selected using the
-``gpu_id`` parameter, which defaults to 0 (the first device reported by CUDA runtime).
+``device`` parameter, which defaults to 0 when "CUDA" is specified (the first device reported by the CUDA
+runtime).

 The GPU algorithms currently work with CLI, Python, R, and JVM packages. See :doc:`/install` for details.

 .. code-block:: python
    :caption: Python example

-    param['gpu_id'] = 0
+    param["device"] = "cuda:0"
     param['tree_method'] = 'gpu_hist'

 .. code-block:: python
    :caption: With Scikit-Learn interface

-    XGBRegressor(tree_method='gpu_hist', gpu_id=0)
+    XGBRegressor(tree_method='gpu_hist', device="cuda")

 GPU-Accelerated SHAP values

@@ -45,7 +46,7 @@ XGBoost makes use of `GPUTreeShap <https://github.com/rapidsai/gputreeshap>`_ as

 .. code-block:: python

-    model.set_param({"gpu_id": "0", "tree_method": "gpu_hist"})
+    model.set_param({"device": "cuda:0", "tree_method": "gpu_hist"})
     shap_values = model.predict(dtrain, pred_contribs=True)
     shap_interaction_values = model.predict(dtrain, pred_interactions=True)
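For context, a minimal end-to-end sketch of how the renamed parameter fits together with the GPU-accelerated SHAP snippet above; the random data, the booster name, and the number of boosting rounds are illustrative assumptions, not part of this commit:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    # Toy data, purely for illustration.
    X = np.random.rand(512, 8)
    y = np.random.rand(512)
    dtrain = xgb.DMatrix(X, label=y)

    # Select the first CUDA device with the new ``device`` parameter
    # (previously done through ``gpu_id``).
    params = {"tree_method": "gpu_hist", "device": "cuda:0"}
    booster = xgb.train(params, dtrain, num_boost_round=10)

    # GPU-accelerated SHAP values and interactions, as in the diff above.
    shap_values = booster.predict(dtrain, pred_contribs=True)
    shap_interaction_values = booster.predict(dtrain, pred_interactions=True)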

doc/install.rst (+4 -4)

@@ -3,10 +3,10 @@ Installation Guide
 ##################

 XGBoost provides binary packages for some language bindings. The binary packages support
-the GPU algorithm (``gpu_hist``) on machines with NVIDIA GPUs. Please note that **training
-with multiple GPUs is only supported for Linux platform**. See :doc:`gpu/index`. Also we
-have both stable releases and nightly builds, see below for how to install them. For
-building from source, visit :doc:`this page </build>`.
+the GPU algorithm (``device=cuda:0``) on machines with NVIDIA GPUs. Please note that
+**training with multiple GPUs is only supported for the Linux platform**. See
+:doc:`gpu/index`. Also we have both stable releases and nightly builds; see below for how
+to install them. For building from source, visit :doc:`this page </build>`.

 .. contents:: Contents

doc/parameter.rst (+20 -19)

@@ -59,6 +59,18 @@ General Parameters

 - Feature dimension used in boosting, set to maximum dimension of the feature

+* ``device`` [default= ``cpu``]
+
+  .. versionadded:: 2.0.0
+
+  - Device for XGBoost to run. Users can set it to one of the following values:
+
+    + ``cpu``: Use CPU.
+    + ``cuda``: Use a GPU (CUDA device).
+    + ``cuda:<ordinal>``: ``<ordinal>`` is an integer that specifies the ordinal of the GPU (which GPU to use if you have more than one device).
+    + ``gpu``: Default GPU device selection from the list of available and supported devices. Only ``cuda`` devices are supported currently.
+    + ``gpu:<ordinal>``: Default GPU device selection from the list of available and supported devices, using the specified ordinal. Only ``cuda`` devices are supported currently.
+
 Parameters for Tree Booster
 ===========================
 * ``eta`` [default=0.3, alias: ``learning_rate``]
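To make the accepted values concrete, a small sketch of the new parameter in both the native and the scikit-learn interfaces; the parameter combinations are examples only, not defaults prescribed by this commit:

.. code-block:: python

    import xgboost as xgb

    # Native interface: any of "cpu", "cuda", "cuda:<ordinal>", "gpu", "gpu:<ordinal>".
    params = {"tree_method": "hist", "device": "cuda:1"}  # second CUDA device

    # The scikit-learn wrapper accepts the same strings.
    reg = xgb.XGBRegressor(tree_method="hist", device="cuda")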
@@ -99,7 +111,7 @@ Parameters for Tree Booster
   - ``gradient_based``: the selection probability for each training instance is proportional to the
     *regularized absolute value* of gradients (more specifically, :math:`\sqrt{g^2+\lambda h^2}`).
     ``subsample`` may be set to as low as 0.1 without loss of model accuracy. Note that this
-    sampling method is only supported when ``tree_method`` is set to ``gpu_hist``; other tree
+    sampling method is only supported when ``tree_method`` is set to ``hist`` and the device is ``cuda``; other tree
     methods only support ``uniform`` sampling.

 * ``colsample_bytree``, ``colsample_bylevel``, ``colsample_bynode`` [default=1]
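As a hedged illustration of the combination documented in this hunk, a configuration enabling gradient-based sampling on a CUDA device might look as follows; the data and round counts are arbitrary:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(2048, 16)
    y = np.random.rand(2048)
    dtrain = xgb.DMatrix(X, label=y)

    params = {
        "tree_method": "hist",
        "device": "cuda",                    # gradient-based sampling needs the CUDA device
        "sampling_method": "gradient_based",
        "subsample": 0.1,                    # can be set this low without losing accuracy
    }
    booster = xgb.train(params, dtrain, num_boost_round=20)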
@@ -131,26 +143,15 @@ Parameters for Tree Booster
 * ``tree_method`` string [default= ``auto``]

   - The tree construction algorithm used in XGBoost. See description in the `reference paper <http://arxiv.org/abs/1603.02754>`_ and :doc:`treemethod`.
-  - XGBoost supports ``approx``, ``hist`` and ``gpu_hist`` for distributed training. Experimental support for external memory is available for ``approx`` and ``gpu_hist``.
-
-  - Choices: ``auto``, ``exact``, ``approx``, ``hist``, ``gpu_hist``, this is a
-    combination of commonly used updaters. For other updaters like ``refresh``, set the
-    parameter ``updater`` directly.
-
-  - ``auto``: Use heuristic to choose the fastest method.

-    - For small dataset, exact greedy (``exact``) will be used.
-    - For larger dataset, approximate algorithm (``approx``) will be chosen. It's
-      recommended to try ``hist`` and ``gpu_hist`` for higher performance with large
-      dataset.
-      (``gpu_hist``)has support for ``external memory``.
+  - Choices: ``auto``, ``exact``, ``approx``, ``hist``; this is a combination of commonly
+    used updaters. For other updaters like ``refresh``, set the parameter ``updater``
+    directly.

-    - Because old behavior is always use exact greedy in single machine, user will get a
-      message when approximate algorithm is chosen to notify this choice.
+    - ``auto``: Same as the ``hist`` tree method.
     - ``exact``: Exact greedy algorithm. Enumerates all split candidates.
     - ``approx``: Approximate greedy algorithm using quantile sketch and gradient histogram.
     - ``hist``: Faster histogram optimized approximate greedy algorithm.
-    - ``gpu_hist``: GPU implementation of ``hist`` algorithm.

 * ``scale_pos_weight`` [default=1]
@@ -163,7 +164,7 @@ Parameters for Tree Booster
   - ``grow_colmaker``: non-distributed column-based construction of trees.
   - ``grow_histmaker``: distributed tree construction with row-based data splitting based on global proposal of histogram counting.
   - ``grow_quantile_histmaker``: Grow tree using quantized histogram.
-  - ``grow_gpu_hist``: Grow tree with GPU.
+  - ``grow_gpu_hist``: Grow tree with GPU. Same as setting ``tree_method`` to ``hist`` and using ``device=cuda``.
   - ``sync``: synchronizes trees in all distributed nodes.
   - ``refresh``: refreshes tree's statistics and/or leaf values based on the current data. Note that no random subsampling of data rows is performed.
   - ``prune``: prunes the splits where loss < min_split_loss (or gamma) and nodes that have depth greater than ``max_depth``.
@@ -183,7 +184,7 @@ Parameters for Tree Booster
 * ``grow_policy`` [default= ``depthwise``]

   - Controls a way new nodes are added to the tree.
-  - Currently supported only if ``tree_method`` is set to ``hist``, ``approx`` or ``gpu_hist``.
+  - Currently supported only if ``tree_method`` is set to ``hist`` or ``approx``.
   - Choices: ``depthwise``, ``lossguide``

     - ``depthwise``: split at nodes closest to the root.
@@ -195,7 +196,7 @@ Parameters for Tree Booster

 * ``max_bin``, [default=256]

-  - Only used if ``tree_method`` is set to ``hist``, ``approx`` or ``gpu_hist``.
+  - Only used if ``tree_method`` is set to ``hist`` or ``approx``.
   - Maximum number of discrete bins to bucket continuous features.
   - Increasing this number improves the optimality of splits at the cost of higher computation time.

doc/treemethod.rst (+28 -32)

@@ -3,14 +3,14 @@ Tree Methods
 ############

 For training boosted tree models, there are 2 parameters used for choosing algorithms,
-namely ``updater`` and ``tree_method``. XGBoost has 4 builtin tree methods, namely
-``exact``, ``approx``, ``hist`` and ``gpu_hist``. Along with these tree methods, there
-are also some free standing updaters including ``refresh``,
-``prune`` and ``sync``. The parameter ``updater`` is more primitive than ``tree_method``
-as the latter is just a pre-configuration of the former. The difference is mostly due to
-historical reasons that each updater requires some specific configurations and might has
-missing features. As we are moving forward, the gap between them is becoming more and
-more irrelevant. We will collectively document them under tree methods.
+namely ``updater`` and ``tree_method``. XGBoost has 3 builtin tree methods, namely
+``exact``, ``approx`` and ``hist``. Along with these tree methods, there are also some
+free standing updaters including ``refresh``, ``prune`` and ``sync``. The parameter
+``updater`` is more primitive than ``tree_method`` as the latter is just a
+pre-configuration of the former. The difference is mostly due to historical reasons:
+each updater requires some specific configurations and might have missing features. As we
+move forward, the gap between them is becoming more and more irrelevant. We will
+collectively document them under tree methods.

 **************
 Exact Solution
@@ -19,23 +19,23 @@ Exact Solution
 Exact means XGBoost considers all candidates from data for tree splitting, but underlying
 the objective is still interpreted as a Taylor expansion.

-1. ``exact``: Vanilla gradient boosting tree algorithm described in `reference paper
-   <http://arxiv.org/abs/1603.02754>`_. During each split finding procedure, it iterates
-   over all entries of input data. It's more accurate (among other greedy methods) but
-   slow in computation performance. Also it doesn't support distributed training as
-   XGBoost employs row spliting data distribution while ``exact`` tree method works on a
-   sorted column format. This tree method can be used with parameter ``tree_method`` set
-   to ``exact``.
+1. ``exact``: The vanilla gradient boosting tree algorithm described in `reference paper
+   <http://arxiv.org/abs/1603.02754>`_. During split-finding, it iterates over all
+   entries of input data. It's more accurate (among other greedy methods) but
+   computationally slower compared to other tree methods. Furthermore, its feature
+   set is limited. Features like distributed training and external memory that require
+   approximated quantiles are not supported. This tree method can be used with the
+   parameter ``tree_method`` set to ``exact``.


 **********************
 Approximated Solutions
 **********************

-As ``exact`` tree method is slow in performance and not scalable, we often employ
-approximated training algorithms. These algorithms build a gradient histogram for each
-node and iterate through the histogram instead of real dataset. Here we introduce the
-implementations in XGBoost below.
+As the ``exact`` tree method is computationally slow and difficult to scale, we
+often employ approximated training algorithms. These algorithms build a gradient
+histogram for each node and iterate through the histogram instead of the real dataset. Here
+we introduce the implementations in XGBoost.

 1. ``approx`` tree method: An approximation tree method described in `reference paper
    <http://arxiv.org/abs/1603.02754>`_. It runs sketching before building each tree
@@ -48,22 +48,18 @@ implementations in XGBoost below.
    this global sketch. This is the fastest algorithm as it runs sketching only once. The
    algorithm can be accessed by setting ``tree_method`` to ``hist``.

-3. ``gpu_hist`` tree method: The ``gpu_hist`` tree method is a GPU implementation of
-   ``hist``, with additional support for gradient based sampling. The algorithm can be
-   accessed by setting ``tree_method`` to ``gpu_hist``.
-
 ************
 Implications
 ************

-Some objectives like ``reg:squarederror`` have constant hessian. In this case, ``hist``
-or ``gpu_hist`` should be preferred as weighted sketching doesn't make sense with constant
+Some objectives like ``reg:squarederror`` have constant hessian. In this case,
+``hist`` should be preferred as weighted sketching doesn't make sense with constant
 weights. When using non-constant hessian objectives, sometimes ``approx`` yields better
-accuracy, but with slower computation performance. Most of the time using ``(gpu)_hist``
-with higher ``max_bin`` can achieve similar or even superior accuracy while maintaining
-good performance. However, as xgboost is largely driven by community effort, the actual
-implementations have some differences than pure math description. Result might have
-slight differences than expectation, which we are currently trying to overcome.
+accuracy, but with slower computation performance. Most of the time, using ``hist`` with
+higher ``max_bin`` can achieve similar or even superior accuracy while maintaining good
+performance. However, as xgboost is largely driven by community effort, the actual
+implementations differ somewhat from the pure mathematical description. Results might
+differ slightly from expectations, which we are currently trying to overcome.

 **************
 Other Updaters
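To ground the ``max_bin`` remark above, a hypothetical comparison between the two approximated methods could look like this; the dataset, bin count, and round count are illustrative:

.. code-block:: python

    import numpy as np
    import xgboost as xgb

    X = np.random.rand(4096, 32)
    y = np.random.rand(4096)
    dtrain = xgb.DMatrix(X, label=y)

    # ``approx`` re-sketches before each tree; ``hist`` sketches once, so raising
    # ``max_bin`` is a cheap way to recover accuracy while keeping its speed.
    approx_booster = xgb.train({"tree_method": "approx"}, dtrain, num_boost_round=50)
    hist_booster = xgb.train({"tree_method": "hist", "max_bin": 512}, dtrain, num_boost_round=50)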
@@ -106,8 +102,8 @@ solely for the interest of documentation.
    histogram creation step and uses sketching values directly during split evaluation. It
    was never tested and contained some unknown bugs, we decided to remove it and focus our
    resources on more promising algorithms instead. For accuracy, most of the time
-   ``approx``, ``hist`` and ``gpu_hist`` are enough with some parameters tuning, so
-   removing them don't have any real practical impact.
+   ``approx`` and ``hist`` are enough with some parameter tuning, so removing them
+   doesn't have any real practical impact.

 3. ``grow_local_histmaker`` updater: An approximation tree method described in `reference
    paper <http://arxiv.org/abs/1603.02754>`_. This updater was rarely used in practice so

doc/tutorials/dask.rst (+1 -1)

@@ -149,7 +149,7 @@ Also for inplace prediction:
 .. code-block:: python

     # where X is a dask DataFrame or dask Array backed by cupy or cuDF.
-    booster.set_param({"gpu_id": "0"})
+    booster.set_param({"device": "cuda:0"})
     prediction = xgb.dask.inplace_predict(client, booster, X)

 When input is ``da.Array`` object, output is always ``da.Array``. However, if the input
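For readers who want the surrounding setup, a heavily hedged sketch of GPU inplace prediction with Dask; ``LocalCUDACluster`` (from the optional ``dask-cuda`` package), the random data, and the training parameters are assumptions, and in practice ``X`` should be backed by cupy or cuDF as the comment in the diff notes:

.. code-block:: python

    import dask.array as da
    import xgboost as xgb
    from dask.distributed import Client
    from dask_cuda import LocalCUDACluster  # assumption: dask-cuda is installed

    client = Client(LocalCUDACluster())

    # Toy data; for fully GPU-resident prediction this should be cupy- or cuDF-backed.
    X = da.random.random((10_000, 20), chunks=(1_000, 20))
    y = da.random.random(10_000, chunks=1_000)

    dtrain = xgb.dask.DaskDMatrix(client, X, y)
    output = xgb.dask.train(client, {"tree_method": "hist", "device": "cuda"}, dtrain, num_boost_round=10)
    booster = output["booster"]

    booster.set_param({"device": "cuda:0"})
    prediction = xgb.dask.inplace_predict(client, booster, X)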

doc/tutorials/saving_model.rst (+1 -1)

@@ -163,7 +163,7 @@ Will print out something similar to (not actual output as it's too long for demo
     {
       "Learner": {
         "generic_parameter": {
-          "gpu_id": "0",
+          "device": "cuda:0",
           "gpu_page_size": "0",
           "n_jobs": "0",
           "random_state": "0",

include/xgboost/base.h (+1 -1)

@@ -119,7 +119,7 @@ using bst_group_t = std::uint32_t; // NOLINT
  */
 using bst_target_t = std::uint32_t; // NOLINT
 /**
- * brief Type for indexing boosted layers.
+ * @brief Type for indexing boosted layers.
  */
 using bst_layer_t = std::int32_t; // NOLINT
 /**
