Algorithm 1 in the paper #3

Open
ndvbd opened this issue Aug 13, 2024 · 1 comment

Comments

@ndvbd
ndvbd commented Aug 13, 2024

In Algorithm 1, the running mean and variance are updated at step 12 but are not used anywhere.
Can you elaborate, please?

@samlobel
Owner

samlobel commented Nov 23, 2024

Hi, it's not fully explicit in the algorithm box, but section 3.5.2 (link) explains it in detail. We use the running mean and variance to normalize each dimension of the random prior's output. That way, if the learned component outputs 0 (which it may do for things it has never seen), the initial bonus is still 1, which is roughly the behavior you want on totally novel observations.
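
To make the normalization concrete, here is a minimal NumPy sketch, not the repository's actual code. It assumes the network output is the learned component plus the normalized prior, and that the bonus is the root mean square of that output across dimensions; the names `RunningStats`, `normalize_prior`, and `bonus` are hypothetical, and Gaussian noise stands in for the random prior's raw outputs.

```python
import numpy as np


class RunningStats:
    """Per-dimension running mean and variance, updated one batch at a time."""

    def __init__(self, dim):
        self.count = 0
        self.mean = np.zeros(dim)
        self.m2 = np.zeros(dim)  # running sum of squared deviations from the mean

    def update(self, batch):
        batch = np.atleast_2d(batch)
        n = batch.shape[0]
        batch_mean = batch.mean(axis=0)
        batch_m2 = ((batch - batch_mean) ** 2).sum(axis=0)
        delta = batch_mean - self.mean
        total = self.count + n
        # Parallel (Chan et al.) merge of the old statistics with the new batch.
        self.mean = self.mean + delta * n / total
        self.m2 = self.m2 + batch_m2 + delta**2 * self.count * n / total
        self.count = total

    @property
    def var(self):
        return self.m2 / max(self.count, 1)


def normalize_prior(prior_out, stats, eps=1e-8):
    """Shift and scale each prior dimension to roughly zero mean, unit variance."""
    return (prior_out - stats.mean) / np.sqrt(stats.var + eps)


def bonus(learned_out, prior_out, stats):
    """Assumed bonus: RMS over dimensions of learned output plus normalized prior."""
    combined = learned_out + normalize_prior(prior_out, stats)
    return np.sqrt(np.mean(combined**2, axis=-1))


# Stand-in for a random prior network: arbitrary offset and scale per output.
rng = np.random.default_rng(0)
raw_prior = rng.normal(loc=3.0, scale=5.0, size=(10_000, 20))

stats = RunningStats(dim=20)
stats.update(raw_prior)

# A learned component that outputs 0 everywhere, as on never-seen observations.
novel_bonus = bonus(np.zeros_like(raw_prior), raw_prior, stats)
mean_novel_bonus = float(novel_bonus.mean())
```

Under these assumptions, `mean_novel_bonus` comes out close to 1 regardless of the prior's raw offset and scale. Without the normalization, the initial bonus would depend on whatever magnitude the random prior happens to have.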

Hope that clears it up, and sorry for the slow response -- I didn't see the comment until just now.
