Hi, thank you for the great work and code release.
I noticed that in your code the agent is pretrained on the walker-walk task for 10 million steps. However, judging by the evaluation rewards, the agent appears to converge well before reaching 10M steps.
I therefore tried fine-tuning after only 500k pretraining steps, but downstream performance was significantly worse.
Is there a reason why fine-tuning works better after the full 10M step pretraining?
Is it feasible to fine-tune from a model pretrained for fewer steps (e.g., 1M or 2M) without a significant drop in downstream performance, or is the full pretraining necessary for good transferability?
Also, would it be possible to get access to the pretrained model weights?
Thanks in advance!
