Fix broken #267

Open
thomasflynn918 wants to merge 146 commits into master from fixBroken

Conversation

@thomasflynn918
Collaborator

No description provided.

jmohdyusof and others added 30 commits December 2, 2021 10:19
New keras agent based on TD3
Conflicts:
	exarl/workflows/workflow_vault/async_learner.py
	exarl/workflows/workflow_vault/sync_learner.py
Data structures
Comms
Environments
2. Revert lookup params to run_params to have a single point of change
rvinaybharadwaj and others added 29 commits July 6, 2022 09:56
casting all variables in calc_target_f to float32
fix to config files params: model/model_type
* 1. Fixed spelling and added greedy throttle to SingleRun.pm
2. Fixed support for -A option (previously not being respected)
3. Ensured Tensorflow is not loaded when using Pytorch
4. Fixed log/results directory to increment instead of defaulting to
RUN001
5. Added convergence cutoffs to workflows
6. Fixed/simplified step counting in workflows
7. Added hyper-parameter tuning using sbatch and optuna

* 1. Flake8
2. Added fix to sync with episode block leading to deadlock
3. Commented hyper-parameter tuning script
4. Fixed RUN000 to the next dir if logs exist
5. Adjusted output to match hyper-parameter script

* 1. Updating docs for workflows
2. Minor fix to bsuite_batch help message

* 1. Added ability to load external agents and environments
2. Fixed print and end of run
3. Async convergence call had too many parameters
… for end of episodes. Still touching some things up to get TD3-v1 working with the new fixBroken branch.
…0 work with the fix to the q target calculation and do great on the Pendulum test problem
…ist for no reason. Not sure why that was the case before but it works just fine now with the simpler fix
The OUNoise call only works for a scalar action. Currently the noise is a scalar, so if the action space is bigger, it adds the same value to all actions. 

This fix ensures that the OUNoise matches the shape of the action space.
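
A minimal sketch of the idea behind this fix, not the code in the branch: an Ornstein-Uhlenbeck process that keeps one independent state per action dimension, so each action component receives its own noise value. The class layout, the parameter names (`mu`, `theta`, `sigma`, `dt`, `seed`) and their defaults are assumptions chosen for illustration, and plain Python lists stand in for the array types the agent actually uses.

```python
import math
import random


class OUNoise:
    """Ornstein-Uhlenbeck noise with one state value per action dimension.

    Hypothetical illustration: names and defaults are assumptions, not ExaRL's API.
    """

    def __init__(self, num_actions, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, seed=None):
        self.num_actions = num_actions
        self.mu = mu
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # One entry per action dimension instead of a single scalar.
        self.state = [self.mu] * self.num_actions

    def sample(self):
        sqrt_dt = math.sqrt(self.dt)
        # Each dimension draws its own Gaussian increment.
        self.state = [
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * sqrt_dt * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


noise = OUNoise(num_actions=3, seed=0)
action = [0.1, -0.4, 0.7]
# The noise has the same length as the action, so the sum is element-wise.
noisy_action = [a + n for a, n in zip(action, noise.sample())]
```

With a scalar state, every component of `action` would be shifted by the same amount; here the shift differs per component.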
…sted on the Pendulum case. Again, need to do some performance testing at some point, but the buffer does work on the Pendulum problem. Also added a default buffer for the TD3-v1 case.
Added the capability to do n-step DDPG-v0 and TD3-v1
…ize x self.num_actions, but the noise being sampled was only a vector of length self.num_actions, so the same sampled noise was being added to each next_action in a batch. This fixes it. I also changed the noise shape in the action method. That one was already fine, but making the noise the same shape as sampled_actions guarantees it is correct; leaving it as a self.num_actions vector could make it fragile to future changes in the action space shape.
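
The shape problem described above can be sketched as follows. This is an illustration under assumptions, not the branch's code: it assumes the truncated shape refers to a batch dimension times `self.num_actions`, it assumes clipped Gaussian noise as commonly used for TD3 target smoothing, and the function names, `sigma`, and `clip` are hypothetical. Nested lists stand in for tensors.

```python
import random


def sample_target_noise(batch_size, num_actions, sigma=0.2, clip=0.5, rng=random):
    """Draw an independent clipped Gaussian value for every (sample, action) pair.

    Returns a nested list with batch_size rows and num_actions columns.
    """
    return [
        [max(-clip, min(clip, rng.gauss(0.0, sigma))) for _ in range(num_actions)]
        for _ in range(batch_size)
    ]


def add_noise(next_actions, noise):
    """Element-wise sum of two nested lists that must share the same shape."""
    assert len(next_actions) == len(noise)
    return [
        [a + n for a, n in zip(row_a, row_n)]
        for row_a, row_n in zip(next_actions, noise)
    ]


rng = random.Random(0)
next_actions = [[0.0, 0.0] for _ in range(4)]  # batch of 4, 2 actions each
noise = sample_target_noise(batch_size=4, num_actions=2, rng=rng)
smoothed = add_noise(next_actions, noise)
```

Sampling a single row of length `num_actions` and broadcasting it over the batch would give every `next_action` the same perturbation, which is the behaviour the commit removes.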
…the Dense layer output is sized wrong if the action space is of higher dimension than 1.
… is wrong unless the lower bound is equal to the negative of the upper bound. I switched to the sigmoid activation by default and then scale it properly. Before, the output of the tanh layer was just multiplied by the upper_bound, so the output of the tanh layer was scaled to +/-upper_bound.
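
The difference between the two scalings can be checked with a small sketch. The helper names `scale_sigmoid` and `scale_tanh_old` are hypothetical, and scalar math stands in for the layer outputs; the assumption is that "scale it properly" means an affine map of the sigmoid output onto the interval from the lower bound to the upper bound.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def scale_sigmoid(raw, lower, upper):
    """Map a raw network output into [lower, upper] via a sigmoid."""
    return lower + (upper - lower) * sigmoid(raw)


def scale_tanh_old(raw, upper):
    """Previous behaviour as described: tanh output multiplied by upper_bound."""
    return upper * math.tanh(raw)


# Example action space with bounds [0, 4], which is not symmetric around zero.
lower, upper = 0.0, 4.0
new_low = scale_sigmoid(-50.0, lower, upper)   # stays at or above lower
new_high = scale_sigmoid(50.0, lower, upper)   # stays at or below upper
old_low = scale_tanh_old(-50.0, upper)         # falls below lower
```

For bounds that are symmetric around zero the two mappings cover the same interval, which is why the old scaling only failed for asymmetric action spaces.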
…e version of ExaRL that works with the old version of gym. I don't have the old version working to test it, though.
Fixing a shape issue in TD3. In train_critic,
Adding Soft-Actor Critic method to the fixBroken branch
7 participants