Special axes
Albert Zeyer edited this page Nov 9, 2021 · 5 revisions
Special axes of `Data` are `time_dim_axis` and `feature_dim_axis`, which have a special meaning for some layers and operations in RETURNN. (`batch_dim_axis` is special as well, but it is not really ambiguous, so it is not covered much here.)
(Related: see issue #586 on whether `time_dim_axis` and `feature_dim_axis` should be removed.)
Some operations or layers operate on spatial axes, which are defined as:
    [axis
     for axis in range(self.batch_ndim)
     if axis != self.batch_dim_axis
     and (axis != self.feature_dim_axis or
          axis == self.time_dim_axis or
          self.batch_shape[axis] is None)]
This returns all axes except the feature dim axis and the batch dim axis. However, if the feature dim axis is the same as the time dim axis, or if the feature dim is dynamic (its entry in `batch_shape` is `None`), that axis is included as well.
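To make the rule concrete, the following is a minimal standalone sketch that restates the list comprehension above as a free function, so it can be tried without RETURNN. The function name `spatial_axes` and its argument list are assumptions made for this illustration; only the condition itself is taken from the definition above.

```python
from typing import List, Optional, Sequence


def spatial_axes(
    batch_shape: Sequence[Optional[int]],
    batch_dim_axis: Optional[int],
    time_dim_axis: Optional[int],
    feature_dim_axis: Optional[int],
) -> List[int]:
    """Hypothetical helper: same condition as the definition above,
    but taking the axis information as plain arguments."""
    return [
        axis
        for axis in range(len(batch_shape))
        if axis != batch_dim_axis
        and (
            axis != feature_dim_axis
            or axis == time_dim_axis
            or batch_shape[axis] is None  # None marks a dynamic size
        )
    ]


# (batch, time, feature) with a static feature dim: only time is spatial.
print(spatial_axes((None, None, 40), 0, 1, 2))  # [1]

# (batch, height, width, channels): both image axes are spatial.
print(spatial_axes((None, 32, 32, 3), 0, 1, 3))  # [1, 2]

# Feature axis coincides with the time axis: it is kept.
print(spatial_axes((None, None), 0, 1, 1))  # [1]

# Dynamic feature dim: the feature axis is kept as well.
print(spatial_axes((None, 10, None), 0, 1, 2))  # [1, 2]
```

The last two calls exercise the two exceptions described above, where the feature dim axis counts as spatial.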
A list of layers which make use of special axes or spatial axes:
- All layers deriving from `_ConcatInputLayer`, which concatenate multiple inputs in the feature dimension, thus using `feature_dim_axis` (only used when multiple inputs are actually passed, otherwise ignored).
- `LinearLayer` operates on the feature dim, using `feature_dim_axis`, or operates on sparse tensors. It accepts any tensor of any number of dimensions, as long as it has a feature dim, or is sparse (with finite number of classes).
- `RecLayer` operates on the feature dim (using `feature_dim_axis`) and iterates over the time dim (`time_dim_axis`). Most builtin units (e.g. LSTM) expect a 3D input (batch, time, feature, in any order). A rec subnet can operate on anything as long as it has a time dim.
- `ConvLayer`, `TransposedConvLayer`, `PoolLayer` operate on spatial axes and on the feature dim.
- `GenericAttentionLayer` makes use of the time dim and feature dim of the `base`.
- `SoftmaxOverSpatialLayer` uses `time_dim_axis` by default, although you can also explicitly specify the axis.
- Some layers use input `time_dim_axis` to determine whether they run in a recurrent loop. E.g. `CumsumLayer`. This is a somewhat special meaning of the (default) argument `axis="T"`.
- ... (this list is incomplete)
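
Several entries in the list refer to an axis by a symbolic name such as `axis="T"`. As a rough illustration of how such a name can be mapped onto the special axes, here is a minimal sketch. The helper `resolve_axis` is hypothetical and is not RETURNN's API, and the names `"B"` and `"F"` are assumed here by analogy with `"T"`.

```python
from typing import Optional, Union


def resolve_axis(
    axis: Union[int, str],
    *,
    batch_dim_axis: Optional[int],
    time_dim_axis: Optional[int],
    feature_dim_axis: Optional[int],
) -> int:
    """Hypothetical helper: map "B"/"T"/"F" to the matching special axis.

    Integer axes are returned unchanged.
    """
    if isinstance(axis, int):
        return axis
    special = {
        "B": batch_dim_axis,
        "T": time_dim_axis,
        "F": feature_dim_axis,
    }
    key = axis.upper()
    if key not in special:
        raise ValueError("unknown axis description: %r" % (axis,))
    resolved = special[key]
    if resolved is None:
        # The tensor does not define this special axis at all.
        raise ValueError("axis %r is not defined for this tensor" % (axis,))
    return resolved


# A (batch, time, feature) layout.
layout = dict(batch_dim_axis=0, time_dim_axis=1, feature_dim_axis=2)
print(resolve_axis("T", **layout))  # 1
print(resolve_axis("F", **layout))  # 2
print(resolve_axis(0, **layout))    # 0
```

With a lookup of this kind, a default such as `axis="T"` follows whatever axis the input marks as its time dim, so the same layer configuration works for inputs with different axis orders.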