From 4cc1afc303eb35bb4e83936b27533d378953e654 Mon Sep 17 00:00:00 2001
From: Moritz Schmidt
Date: Wed, 7 Aug 2024 09:58:11 +0200
Subject: [PATCH] Fix typos

---
 docs/soar_manual/05_ReinforcementLearning.md | 4 ++--
 docs/tutorials/soar_tutorial/06.md           | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/soar_manual/05_ReinforcementLearning.md b/docs/soar_manual/05_ReinforcementLearning.md
index 7d82e12d..7f1d959f 100644
--- a/docs/soar_manual/05_ReinforcementLearning.md
+++ b/docs/soar_manual/05_ReinforcementLearning.md
@@ -299,7 +299,7 @@ An example walkthrough of a Sarsa update with $\alpha = 0.3$ and $\gamma = 0.9$
 
 The previous description had assumed that RL operators were selected in both
 decision cycles $t$ and $t+1$. If the operator selected in $t+1$ is not an RL operator,
-then $Q(s_{t+1},a_{t+1})$ would not be defined, and an update for the RL operator
+then $Q(s_{t+1}, a_{t+1})$ would not be defined, and an update for the RL operator
 selected at time $t$ will be undefined. We will call a sequence of one or more
 decision cycles in which RL operators are not selected between two decision
 cycles in which RL operators are selected a gap. Conceptually, it is desirable
@@ -332,7 +332,7 @@ RL operator.
 
 Gap propagation can be disabled by setting the **temporal-extension** parameter
 of the [`rl` command](../reference/cli/cmd_rl.md) to off. When gap propagation
-is disabled, the RL rules preceding a gap are updated using $Q(s_{t+1},a_{t+1})
+is disabled, the RL rules preceding a gap are updated using $Q(s_{t+1}, a_{t+1})
 = 0$. The rl setting of the [`watch`](../reference/cli/cmd_trace.md) command is
 useful in identifying gaps.
 
diff --git a/docs/tutorials/soar_tutorial/06.md b/docs/tutorials/soar_tutorial/06.md
index 99c9b4f7..796cb48f 100644
--- a/docs/tutorials/soar_tutorial/06.md
+++ b/docs/tutorials/soar_tutorial/06.md
@@ -221,7 +221,7 @@ the *Agents* directory.
 
 ### RL Rules
 
-Rules that are recognized as updateable by the RL mechanism must abide
+Rules that are recognized as updatable by the RL mechanism must abide
 by a specific syntax:
 
 ```Soar