I've run simple benchmarks to measure the impact of different settings on inference throughput when using DeepAR.
Key findings:
- prediction length (and context length) have a linear impact on inference time
- larger batch sizes can significantly speed up inference
- dynamic features have only a minor impact on inference performance
Since DeepAR is an RNN, the first finding is not surprising. The RNN has to be unrolled for each time step, and that happens `prediction_length + context_length` times.
The impact of batch size is a lot more interesting: using very small batch sizes incurs a significant performance penalty. In extreme cases the difference is an order of magnitude. Thus, when predicting many time series, it is especially important to invoke the model in batches.
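As a minimal sketch of what batched invocation looks like (assuming the PyTorch DeepAREstimator; import paths and argument names may differ across GluonTS versions, and in the versions I used the estimator's `batch_size` is also the batch size the predictor uses at inference time):

```python
# Sketch only: one predict() call over the whole dataset lets the predictor batch
# series internally; calling predict() once per series effectively forces batch size 1.
import numpy as np
from gluonts.dataset.common import ListDataset
from gluonts.torch import DeepAREstimator

dataset = ListDataset(
    [{"start": "2020-01-01", "target": np.random.rand(1000)} for _ in range(128)],
    freq="H",
)

estimator = DeepAREstimator(
    freq="H",
    prediction_length=24,
    context_length=48,
    batch_size=128,                      # one forward pass covers up to 128 series
    trainer_kwargs={"max_epochs": 1},
)
predictor = estimator.train(dataset)

forecasts = list(predictor.predict(dataset))                        # batched: fast
# forecasts = [f for e in dataset for f in predictor.predict([e])]  # per series: slow
```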
Lastly, dynamic features appear to be very cheap in terms of inference runtime. For example, inference time increased by only ~10% when the number of dynamic features was increased from 1 to 50.
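For reference, this is roughly how dynamic features enter the picture (a sketch with made-up shapes; the `num_feat_dynamic_real` argument and the expected length of `feat_dynamic_real` may vary by GluonTS version):

```python
# Sketch only: each series carries a (num_features, length) array of dynamic
# real-valued features, and the estimator is told how many features to expect.
import numpy as np
from gluonts.dataset.common import ListDataset
from gluonts.torch import DeepAREstimator

SERIES_LEN, PRED_LEN, NUM_FEATURES = 1000, 24, 50

dataset = ListDataset(
    [
        {
            "start": "2020-01-01",
            "target": np.random.rand(SERIES_LEN),
            # Depending on the version, the features may need to cover the target
            # only, or the target plus the prediction horizon (as here).
            "feat_dynamic_real": np.random.rand(NUM_FEATURES, SERIES_LEN + PRED_LEN),
        }
        for _ in range(128)
    ],
    freq="H",
)

estimator = DeepAREstimator(
    freq="H",
    prediction_length=PRED_LEN,
    num_feat_dynamic_real=NUM_FEATURES,  # the model only consumes the features when this is > 0
    trainer_kwargs={"max_epochs": 1},
)
```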
Impact of batch size
MXNet DeepAR
Note: The chart shows total inference time on a log scale. The table shows the average time in milliseconds for one time step.
Torch DeepAR
Note: The chart shows total inference time on a log scale. The table shows the average time in milliseconds for one time step.
Impact of number of dynamic features
Torch DeepAR
Experiment Setup
For each configuration, a basic model was trained and then used for inference. The input data consists of 128 time series with 1000 random values each. For inference, the input data is split into batches of the required batch size. The measured time is how long it takes to run inference on the entire input data using `model.predict(...)`.
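A rough sketch of what that measurement looks like (a hypothetical script, not the exact benchmark code; it assumes the PyTorch DeepAREstimator and uses wall-clock timing via time.perf_counter):

```python
# Sketch of the timing setup: 128 series of 1000 random values, one predict() call
# over the whole dataset, timed end to end.
import time
import numpy as np
from gluonts.dataset.common import ListDataset
from gluonts.torch import DeepAREstimator

N_SERIES, SERIES_LEN, PRED_LEN, BATCH_SIZE = 128, 1000, 24, 32

dataset = ListDataset(
    [{"start": "2020-01-01", "target": np.random.rand(SERIES_LEN)} for _ in range(N_SERIES)],
    freq="H",
)

estimator = DeepAREstimator(
    freq="H",
    prediction_length=PRED_LEN,
    batch_size=BATCH_SIZE,               # inference runs in batches of this size
    trainer_kwargs={"max_epochs": 1},
)
predictor = estimator.train(dataset)

start = time.perf_counter()
forecasts = list(predictor.predict(dataset))  # consume the generator to force all work
elapsed = time.perf_counter() - start
print(f"{elapsed:.2f} s total, {elapsed / N_SERIES * 1000:.1f} ms per series")
```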