
Commit 725f8f6

Update documentation
1 parent 9108cc5 commit 725f8f6

4 files changed: +77 -32 lines changed


docs/source/user/mip-models.rst

Lines changed: 29 additions & 13 deletions
@@ -57,29 +57,45 @@ function. By default, the approximation guarantees a maximal error of
 keyword argument when the constraints is added.
 
 
-Neural Networks
-===============
+Sequential Neural Networks
+==========================
 
-The package currently models dense neural network with ReLU activations. For a
-given neuron the relation between its inputs and outputs is given by:
+The package supports sequential neural networks. Layers are added as building
+blocks; the package creates the necessary variables and constraints and wires
+them to match the network structure.
+
+Dense layers (details)
+----------------------
+
+For dense layers with ReLU activations, each neuron applies an affine
+transformation followed by a ReLU. For a neuron with weights
+:math:`\beta \in \mathbb{R}^{p+1}`, inputs :math:`x`, and output :math:`y`:
 
 .. math::
 
-    y = \max(\sum_{i=1}^p \beta_i x_i + \beta_0, 0).
+    y = \max\Big(\sum_{i=1}^p \beta_i x_i + \beta_0,\; 0\Big).
 
-The relationship is formulated in the optimization model by using Gurobi
-:math:`max` `general constraint
-<https://www.gurobi.com/documentation/latest/refman/constraints.html#subsubsection:GeneralConstraints>`_
-with:
+This is modeled using Gurobi general constraints by introducing an auxiliary
+variable :math:`\omega` for the affine part and then enforcing the ReLU:
 
 .. math::
 
-    & \omega = \sum_{i=1}^p \beta_i x_i + \beta_0
+    &\omega = \sum_{i=1}^p \beta_i x_i + \beta_0,\\
+    &y = \max(\omega, 0).
+
+Other layers (summary)
+----------------------
 
-    & y = \max(\omega, 0)
+- Conv2D and MaxPooling2D: supported with padding equivalent to ``valid`` only
+  (no non-zero or ``same`` padding). Strides are supported. Internally, tensors
+  use channels-last layout (NHWC) in the optimization model.
+- Flatten: converts a 4D (NHWC) tensor to 2D (batch, features).
+- Dropout: accepted but ignored at inference time (treated as identity).
 
-with :math:`\omega` an auxiliary free variable. The neurons are then connected
-according to the topology of the network.
+Notes:
+- Keras models use NHWC throughout. PyTorch models are evaluated in NCHW, but
+  the package handles the necessary internal conversions so predicted values
+  match the framework's behavior.
 
 
 Decision Tree Regression
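
For readers of this hunk: the ReLU formulation above maps directly onto Gurobi's MAX general constraint. A minimal gurobipy sketch of a single neuron, with illustrative weights and bounds (not the package's internals):

    import gurobipy as gp

    m = gp.Model()
    x = m.addVars(3, lb=-1.0, ub=1.0, name="x")          # neuron inputs
    omega = m.addVar(lb=-gp.GRB.INFINITY, name="omega")  # free auxiliary variable for the affine part
    y = m.addVar(name="y")                               # ReLU output (default lb=0)

    beta = [0.5, -0.2, 0.3]  # illustrative weights
    beta0 = 0.1              # illustrative bias

    # omega = sum_i beta_i * x_i + beta_0
    m.addConstr(omega == gp.quicksum(beta[i] * x[i] for i in range(3)) + beta0)

    # y = max(omega, 0), expressed with a Gurobi MAX general constraint
    m.addGenConstrMax(y, [omega], constant=0.0)

The package applies this construction neuron by neuron and connects the resulting variables layer to layer.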

docs/source/user/start.rst

Lines changed: 10 additions & 0 deletions
@@ -154,6 +154,16 @@ For a simple example on how to use the package please refer to
 in the :doc:`../auto_examples/index` section.
 
 
+.. note::
+
+   Variable shapes: For tabular models (scikit-learn, tree ensembles, dense
+   neural nets), inputs are typically 2D MVars with shape ``(batch, features)``
+   and outputs are 1D or 2D (the package orients a 1D output based on the
+   batch size). For convolutional neural networks (Keras/PyTorch), inputs can be
+   4D MVars with shape ``(batch, H, W, C)`` (channels-last). A 3D input of shape
+   ``(H, W, C)`` is automatically interpreted as a single-batch input.
+
+
 .. rubric:: Footnotes
 
 .. [#] Classification models are currently not supported (except binary logistic
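
A quick gurobipy illustration of the variable shapes described in the added note (sizes and bounds are illustrative):

    import gurobipy as gp

    m = gp.Model()

    # Tabular model: 2D MVar with shape (batch, features), here 5 samples x 4 features
    x_tab = m.addMVar((5, 4), lb=0.0, ub=1.0, name="x_tab")

    # CNN (Keras/PyTorch): 4D channels-last MVar with shape (batch, H, W, C)
    x_img = m.addMVar((1, 28, 28, 1), lb=0.0, ub=1.0, name="x_img")

    # A 3D MVar of shape (H, W, C) would be treated as a single-batch input
    x_one = m.addMVar((28, 28, 1), lb=0.0, ub=1.0, name="x_one")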

docs/source/user/supported.rst

Lines changed: 36 additions & 17 deletions
@@ -99,27 +99,46 @@ Keras
 They can be formulated in a Gurobi model with the function
 :py:func:`add_keras_constr <gurobi_ml.keras.add_keras_constr>`.
 
-Currently, only two types of layers are supported:
-
-* `Dense layers <https://keras.io/api/layers/core_layers/dense/>`_ (possibly
-  with `relu` activation),
-* `ReLU layers <https://keras.io/api/layers/activation_layers/relu/>`_ with
-  default settings.
+Supported layers and notes:
+
+- `Dense <https://keras.io/api/layers/core_layers/dense/>`_ with activation
+  ``relu`` or ``linear``.
+- `ReLU <https://keras.io/api/layers/activation_layers/relu/>`_ with default
+  settings (no negative_slope/threshold/max_value variations).
+- `Conv2D <https://keras.io/api/layers/convolution_layers/convolution2d/>`_
+  with activation ``relu`` or ``linear`` and padding ``valid`` only (no
+  ``same`` padding). Strides are supported.
+- `MaxPooling2D <https://keras.io/api/layers/pooling_layers/max_pooling2d/>`_
+  with padding ``valid`` only.
+- `Flatten <https://keras.io/api/layers/reshaping_layers/flatten/>`_.
+- `Dropout <https://keras.io/api/layers/regularization_layers/dropout/>`_ is
+  accepted but ignored at inference time (treated as identity).
+
+Input tensors for CNNs use channels-last layout (NHWC). Flatten converts 4D
+NHWC tensors to 2D (batch, features).
 
 PyTorch
 -------
 
-
-In PyTorch, only :external+torch:py:class:`torch.nn.Sequential` objects are
-supported.
-
-They can be formulated in a Gurobi model with the function
-:py:func:`add_sequential_constr <gurobi_ml.torch.sequential.add_sequential_constr>`.
-
-Currently, only two types of layers are supported:
-
-* :external+torch:py:class:`Linear layers <torch.nn.Linear>`,
-* :external+torch:py:class:`ReLU layers <torch.nn.ReLU>`.
+In PyTorch, :external+torch:py:class:`torch.nn.Sequential` models are supported
+via :py:func:`add_sequential_constr <gurobi_ml.torch.sequential.add_sequential_constr>`.
+
+Supported layers and notes:
+
+- :external+torch:py:class:`Linear <torch.nn.Linear>`.
+- :external+torch:py:class:`ReLU <torch.nn.ReLU>`.
+- :external+torch:py:class:`Conv2d <torch.nn.Conv2d>` with padding equivalent
+  to ``valid`` only (no non-zero padding or ``same``), strides supported.
+- :external+torch:py:class:`MaxPool2d <torch.nn.MaxPool2d>` with padding
+  equivalent to ``valid`` only.
+- :external+torch:py:class:`Flatten <torch.nn.Flatten>`.
+- :external+torch:py:class:`Dropout <torch.nn.Dropout>` is accepted and
+  ignored at inference time (identity).
+
+Input tensors for CNNs are provided as NHWC variables. Internally, inputs are
+converted to NCHW for PyTorch evaluation and converted back for error checks.
+The first Linear after a Flatten layer is adjusted to account for PyTorch's
+NCHW flatten order so that predictions match exactly.
 
 XGBoost
 -------
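
A minimal sketch of the PyTorch workflow this hunk documents, using the entry point named in the diff (layer sizes and bounds are illustrative; the Keras path via add_keras_constr is analogous):

    import gurobipy as gp
    import torch
    from gurobi_ml.torch import add_sequential_constr

    # A small Sequential model built only from supported layers
    nn_model = torch.nn.Sequential(
        torch.nn.Linear(4, 8),
        torch.nn.ReLU(),
        torch.nn.Linear(8, 1),
    )

    m = gp.Model()
    x = m.addMVar((1, 4), lb=0.0, ub=1.0, name="x")        # 2D input (batch, features)
    y = m.addMVar((1, 1), lb=-gp.GRB.INFINITY, name="y")   # output variables
    pred_constr = add_sequential_constr(m, nn_model, x, y)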

src/gurobi_ml/modeling/_var_utils.py

Lines changed: 2 additions & 2 deletions
@@ -315,8 +315,8 @@ def validate_input_vars(model, gp_vars, accepted_dim=(1, 2)):
     if mv.ndim == 3:
         return (mv.reshape((1,) + mv.shape), None, None)
     raise ParameterError(
-        "Variables should be an MVar of dimension {}".format(
-            " or ".join([f"{d}" for d in accepted_dim])
+        "Variables should be an MVar of dimension {} and is dimension {}".format(
+            " or ".join([f"{d}" for d in accepted_dim]), mv.ndim
         )
     )
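
A hypothetical snippet showing the improved message (the helper's signature and accepted_dim default come from the hunk header; triggering the error with a 4D MVar is an assumption):

    import gurobipy as gp
    from gurobi_ml.modeling._var_utils import validate_input_vars

    m = gp.Model()
    bad = m.addMVar((2, 3, 4, 5), name="bad")  # 4D where only 1D/2D is accepted
    try:
        validate_input_vars(m, bad, accepted_dim=(1, 2))
    except Exception as err:
        # The message now reports the offending dimension, e.g.:
        # "Variables should be an MVar of dimension 1 or 2 and is dimension 4"
        print(err)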
