Thanks for sharing your excellent work! I followed main_tutorial.md exactly and set up the same environment on H800 GPUs, but the actor entropy keeps increasing during training, as shown below.

Is this normal? Also, my test score (around 0.39 at 100 steps and 0.43 at 950 steps) is lower than yours (0.458 at 100 steps). Could this be related to the entropy issue?
