Thanks for the very sophisticated code.
I ran the code with the command (python main.py --env MountainCar-v0). The figure below shows part of the result. Apparently, the performance of the evolutionary algorithm is much better than that of the gradient-descent method (DDQN).

As the paper describes, ERL retains the experiences from the entire evolutionary population in the replay buffer and uses them to update the DDQN parameters by gradient descent. Is such a large performance gap between the evolutionary algorithm and DDQN expected, or am I misunderstanding the code?
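
For reference, here is how I currently understand that data flow: a minimal, hypothetical sketch with stand-in classes (ToyEnv, RandomPolicy, StubDDQN are placeholders I made up, not the repo's actual classes). Please correct me if this differs from what the code actually does.

```python
import random
from collections import deque

# Simplified sketch of the ERL loop as I understand it: every evolutionary
# individual rolls out episodes, all transitions go into one shared replay
# buffer, and the DDQN-style learner trains only from that buffer.

class ReplayBuffer:
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)  # (state, action, reward, next_state, done)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))


class ToyEnv:
    """Stand-in environment with dummy dynamics, only so the sketch runs."""
    def reset(self):
        self.t = 0
        return 0.0

    def step(self, action):
        self.t += 1
        done = self.t >= 200
        return random.random(), -1.0, done  # next_state, reward, done


class RandomPolicy:
    """Stand-in for one evolutionary individual (a mutated actor)."""
    def act(self, state):
        return random.randint(0, 2)


class StubDDQN:
    """Stand-in for the gradient learner; update() would be a double-DQN step."""
    def update(self, batch):
        pass


def rollout(policy, env, buffer):
    # Run one episode with an evolutionary individual and store every
    # transition in the shared buffer (the experience-reuse part of ERL).
    state, done, total_reward = env.reset(), False, 0.0
    while not done:
        action = policy.act(state)
        next_state, reward, done = env.step(action)
        buffer.push((state, action, reward, next_state, done))
        total_reward += reward
        state = next_state
    return total_reward  # fitness used for selection/mutation


def erl_generation(population, env, buffer, learner, grad_steps=100):
    # 1) Evaluate the whole population; their experience fills the shared buffer.
    fitness = [rollout(ind, env, buffer) for ind in population]
    # 2) The DDQN learner trains purely from that buffer via gradient descent.
    for _ in range(grad_steps):
        learner.update(buffer.sample(batch_size=64))
    return fitness


if __name__ == "__main__":
    buf = ReplayBuffer()
    pop = [RandomPolicy() for _ in range(10)]
    print(erl_generation(pop, ToyEnv(), buf, StubDDQN()))
```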