When you define stochastic policies, you write:
\pi (a|s) = P [A|s]
The LHS is a specific real number in [0,1], while the RHS is a whole probability distribution over actions, isn't it?
So I think it should be something like \pi (a|s) = P [A_t = a | S_t = s]. Alternatively, the RHS could state in words that it is the probability of choosing action a given state s.
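To illustrate the distinction, here is a minimal sketch using a hypothetical tabular policy. The state and action names, the probabilities, and the helper functions `pi` and `pi_dist` are made up for illustration and are not taken from the text under discussion.

```python
# Hypothetical tabular policy: policy[s][a] = P[A_t = a | S_t = s].
# States, actions, and probabilities are illustrative assumptions.
policy = {
    "s0": {"left": 0.25, "right": 0.75},
    "s1": {"left": 0.5, "right": 0.5},
}


def pi(a, s):
    """Return the scalar pi(a|s) = P[A_t = a | S_t = s]."""
    return policy[s][a]


def pi_dist(s):
    """Return the whole distribution pi(.|s) over actions in state s."""
    return policy[s]


print(pi("right", "s0"))  # a single number in [0, 1]
print(pi_dist("s0"))      # a mapping from every action to its probability
```

Here `pi("right", "s0")` is one number, which is what the LHS \pi (a|s) denotes, whereas `pi_dist("s0")` is the full distribution, which is what the notation P [A|s] reads as.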