Thank you for sharing your code; it has been incredibly helpful for my research.
I have a question about the implementation of the RMAPPO algorithm. Does the recurrent variant mean that a GRU layer is added after the three-layer MLP, followed by an output layer? Also, should the actor and critic networks have the same structure?
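To make the question concrete, here is a minimal PyTorch sketch of the structure I have in mind. It only reflects my own understanding and is not taken from the repository: the class name `RecurrentNet`, the layer sizes, and the input dimensions are all hypothetical, and I am assuming the actor and critic share the same layout and differ only in their input and output sizes.

```python
import torch
import torch.nn as nn


class RecurrentNet(nn.Module):
    """Hypothetical MLP -> GRU -> output head (my assumption, not the repo's code)."""

    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        # Three fully connected layers (the "three-layer MLP" in my question).
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Single GRU layer placed after the MLP.
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        # Output layer: action logits for the actor, a scalar value for the critic.
        self.head = nn.Linear(hidden_dim, out_dim)

    def forward(self, x, h):
        # x: (batch, time, in_dim); h: (1, batch, hidden_dim)
        feats = self.mlp(x)
        out, h_next = self.gru(feats, h)
        return self.head(out), h_next


# Illustrative sizes only.
obs_dim, state_dim, n_actions, hidden_dim = 10, 24, 5, 64
actor = RecurrentNet(obs_dim, hidden_dim, n_actions)
critic = RecurrentNet(state_dim, hidden_dim, 1)

obs = torch.zeros(4, 8, obs_dim)      # 4 sequences of 8 steps
state = torch.zeros(4, 8, state_dim)
h0 = torch.zeros(1, 4, hidden_dim)    # initial hidden state
logits, h_actor = actor(obs, h0)
values, h_critic = critic(state, h0)
```

Is this ordering (MLP, then GRU, then output layer) what RMAPPO uses, and is it correct that both networks follow it?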
Thank you in advance for your reply; it would be a great help to my research.