diff --git a/examples/univariate_forecasting_with_exogenous_variables.ipynb b/examples/univariate_forecasting_with_exogenous_variables.ipynb
index cd5d0bdb4c..263df8b387 100644
--- a/examples/univariate_forecasting_with_exogenous_variables.ipynb
+++ b/examples/univariate_forecasting_with_exogenous_variables.ipynb
@@ -589,14 +589,14 @@
  "metadata": {},
  "source": [
  "### Conditional Mean uses target alignment\n",
- "When modeling the conditional mean in an AR-X, HAR-X, or LS model, the $X$ data is target-aligned. This requires that when modeling the mean of ``y[t]``, the correct values of $X$ must appear in ``x[t]``. Mathematically, the $X$ matrix used when estimating a model should have the structure\n",
+ "When modeling the conditional mean in an AR-X, HAR-X, or LS model, the $X$ data is target-aligned. This requires that when modeling the mean of ``y[t]``, the correct values of $X$ must appear in ``x[t]``. Mathematically, the $X$ matrix used when estimating a model should have the structure (using the Python indexing convention of a T-element data set having indices 0, 1, ..., T-1):\n",
  "\n",
  "$$\n",
  "\\left[\\begin{array}{c}\n",
  "X_{0}\\\\\n",
  "X_{1}\\\\\n",
  "\\vdots\\\\\n",
- "X_{t-1}\n",
+ "X_{T-1}\n",
  "\\end{array}\\right]\n",
  "$$\n",
  "\n",
@@ -609,22 +609,24 @@
  "\n",
  "$$\n",
  "\\left[\\begin{array}{cccc}\n",
- "E\\left[X_{1|0}\\right] & E\\left[X_{2|0}\\right] & \\ldots & E\\left[X_{h|0}\\right]\\\\\n",
- "E\\left[X_{2|1}\\right] & E\\left[X_{3|1}\\right] & \\ldots & E\\left[X_{h+1|1}\\right]\\\\\n",
+ "E\\left[X_{1}|\\mathcal{F}_0\\right] & E\\left[X_{2}|\\mathcal{F}_0\\right] & \\ldots & E\\left[X_{h}|\\mathcal{F}_0\\right]\\\\\n",
+ "E\\left[X_{2}|\\mathcal{F}_1\\right] & E\\left[X_{3}|\\mathcal{F}_1\\right] & \\ldots & E\\left[X_{h+1}|\\mathcal{F}_1\\right]\\\\\n",
  "\\vdots & \\vdots & \\vdots & \\vdots\\\\\n",
- "E\\left[X_{T|T-1}\\right] & E\\left[X_{T+1|T-1}\\right] & \\ldots & E\\left[X_{T+h-1|T-1}\\right]\n",
+ "E\\left[X_{T}|\\mathcal{F}_{T-1}\\right] & E\\left[X_{T+1}|\\mathcal{F}_{T-1}\\right] & \\ldots & E\\left[X_{T+h-1}|\\mathcal{F}_{T-1}\\right]\n",
  "\\end{array}\\right]\n",
  "$$\n",
  "\n",
+ "where $\\mathcal{F}_{s}$ is the time-$s$ information set.\n",
+ "\n",
- "If you use the same ``x`` value in the model when forecasting, you will see different values due to this alignment difference. Naively using the same ``x`` values ie equivalent to setting\n",
+ "If you use the same ``x`` value in the model when forecasting, you will see different values due to this alignment difference. Naively using the same ``x`` values is equivalent to setting\n",
  "\n",
- "$$ E\\left[X_{j|j-1}] \\right] = X_{j-1} $$\n",
+ "$$ E\\left[X_{s}|\\mathcal{F}_{s-1} \\right] = X_{s-1} $$\n",
  "\n",
  "In general this would not be correct when forecasting, and will always produce forecasts that differ from the conditional mean. In order to recover the conditional mean using the forecast function, it is necessary to ``shift`` the $X$ values by -1, so that once shifted, the ``x`` values will have the relationship\n",
  "\n",
- "$$ E\\left[X_{j|j-1}] \\right] = X_{j} .$$\n",
+ "$$ E\\left[X_{s}|\\mathcal{F}_{s-1} \\right] = X_{s} .$$\n",
  "\n",
- "Here we shift the $X$ data by ``-1`` so ``x[j]`` is treated as being in the information set for ``y[j-1]``. Also note that the final forecast is ``NaN``. Conceptually this must be the case because the value of $X$ at 999 should be ahead of 999 (i.e., at observation 1,000), and we do not have this value. "
+ "Here we shift the $X$ data by ``-1`` so ``x[s]`` is treated as being in the information set for ``y[s-1]``. Also, note that the final forecast is ``NaN``. Conceptually this must be the case because the value of $X$ at 999 should be ahead of 999 (i.e., at observation 1,000), and we do not have this value. "
  ]
  },
  {
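
Review note: the shift by ``-1`` described in the changed cell can be illustrated with a small sketch. It is not taken from the notebook. It assumes a hypothetical five-element series ``x`` in which each value equals its own index, and it uses plain Python lists in place of the notebook's data and model objects. The helper name ``shift_back_one`` is made up for this illustration.

```python
import math

# Hypothetical exogenous data: x[s] == s, so alignment is easy to read off.
x = [0.0, 1.0, 2.0, 3.0, 4.0]


def shift_back_one(values):
    """Move every value one position earlier and pad the end with NaN.

    This mirrors what pandas ``Series.shift(-1)`` does to the values.
    """
    return values[1:] + [math.nan]


# Target-aligned (estimation): row s holds X_s.
x_target_aligned = x

# Forecast-aligned (origin s): row s holds the value used for E[X_{s+1} | F_s].
# Assumption for this sketch: X_{s+1} is treated as known at time s.
x_forecast_aligned = shift_back_one(x)

for s, (fit_val, fcast_val) in enumerate(zip(x_target_aligned, x_forecast_aligned)):
    print(f"s={s}: target-aligned={fit_val}, forecast-aligned={fcast_val}")
```

The last forecast-aligned entry is ``NaN`` because no observation exists beyond the end of the sample, which matches the cell's remark that the final forecast is ``NaN``.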