Algorithm 1 in the paper #3

ndvbd · 2024-08-13T07:54:56Z

In algo 1, the running mean and variance are updated at step 12, but not used anywhere.
Can you elaborate please?

samlobel · 2024-11-23T19:34:38Z

Hi, it's not fully explicit in the algorithm box, but section 3.5.2 (link) explains it in detail. We use the running mean and variance to normalize each dimension of the random prior's output. That way, if the learned component outputs 0 (which it may do for things it has never seen), the initial bonus is still 1, which is roughly the behavior you want on totally novel observations.
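As a rough illustration of that normalization (a minimal sketch, not the repository's code): the snippet below tracks a per-dimension running mean and variance with Welford's algorithm, normalizes the output of a stand-in "random prior", and computes a bonus while the learned component outputs 0. The names `RunningStats`, `bonus`, and `prior_output` are hypothetical. The bonus formula (root-mean-square over output dimensions) and the Gaussian stand-in for the prior network are assumptions made for this example.

```python
import math
import random


class RunningStats:
    """Per-dimension running mean and variance (Welford's algorithm)."""

    def __init__(self, dim):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations per dimension

    def update(self, x):
        self.count += 1
        for i, xi in enumerate(x):
            delta = xi - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (xi - self.mean[i])

    def normalize(self, x, eps=1e-8):
        # Population variance; eps and max() guard against division by zero.
        n = max(self.count, 1)
        return [
            (xi - m) / math.sqrt(m2 / n + eps)
            for xi, m, m2 in zip(x, self.mean, self.m2)
        ]


def bonus(learned_out, prior_out, stats):
    """Assumed bonus: RMS over dimensions of (learned + normalized prior)."""
    z = stats.normalize(prior_out)
    total = [a + b for a, b in zip(learned_out, z)]
    return math.sqrt(sum(t * t for t in total) / len(total))


# Stand-in for a fixed random prior network (assumption): each output
# dimension has its own arbitrary offset and scale.
rng = random.Random(0)
DIM = 64
offsets = [rng.uniform(-5, 5) for _ in range(DIM)]
scales = [rng.uniform(0.1, 10) for _ in range(DIM)]


def prior_output():
    return [o + s * rng.gauss(0, 1) for o, s in zip(offsets, scales)]


# Update the running statistics on observed prior outputs.
stats = RunningStats(DIM)
for _ in range(2000):
    stats.update(prior_output())

# Novel observation: the learned component is assumed to output all zeros.
novel = prior_output()
initial_bonus = bonus([0.0] * DIM, novel, stats)
print(round(initial_bonus, 2))
```

Because each normalized dimension has roughly zero mean and unit variance, the mean of the squared outputs is close to 1, so the bonus starts near 1 regardless of the prior's raw scale. Without the normalization, the initial bonus would depend on whatever offset and scale the random prior happened to have.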

Hope that clears it up, and sorry for the slow response -- I didn't see the comment until just now.
