Fix broken by thomasflynn918 · Pull Request #267 · exalearn/EXARL

thomasflynn918 · 2023-12-19T22:05:02Z

No description provided.

* 1. Fixed spelling and added greedy throttle to SingleRun.pm
  2. Fixed support for -A option (previously not being respected)
  3. Ensured Tensorflow is not loaded when using Pytorch
  4. Fixed log/results directory to increment instead of defaulting to RUN001
  5. Added convergence cutoffs to workflows
  6. Fixed/simplified step counting in workflows
  7. Added hyper-parameter tuning using sbatch and optuna
* 1. Flake8
  2. Added fix to sync with episode block leading to deadlock
  3. Commented hyper-parameter tuning script
  4. Fixed RUN000 to the next dir if logs exist
  5. Adjusted output to match hyper-parameter script
* 1. Updating docs for workflows
  2. Minor fix to bsuite_batch help message
* 1. Added ability to load external agents and environments
  2. Fixed print and end of run
  3. Async convergence call had too many parameters

…ize x self.num_actions, but the noise being sampled was only a self.num_actions length vector. So the same sampled noise was being added to each next_action in a batch. This fixes it. I also changed the noise shape in the action method as well. This one was currently fine, but ensuring it is the shape of sampled_actions ensures that it is the right shape. Having it as a self.num_actions vector could make it not robust to future changes of action space shape.
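The shape fix described above can be illustrated with a minimal NumPy sketch. It assumes the usual TD3 target-policy smoothing (clipped Gaussian noise added to the target actor's next actions); the function name `smoothed_next_actions` and the parameters `sigma`, `noise_clip`, `low`, and `high` are hypothetical and are not taken from the ExaRL code.

```python
import numpy as np


def smoothed_next_actions(next_actions, sigma=0.2, noise_clip=0.5,
                          low=-1.0, high=1.0, rng=None):
    """Add clipped Gaussian noise, drawn independently per batch row and action."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample with the full (batch_size, num_actions) shape. Sampling only a
    # (num_actions,) vector would broadcast the same noise onto every row.
    noise = rng.normal(0.0, sigma, size=next_actions.shape)
    noise = np.clip(noise, -noise_clip, noise_clip)
    return np.clip(next_actions + noise, low, high)


rng = np.random.default_rng(0)
batch = np.zeros((4, 2))  # batch_size = 4, num_actions = 2
out = smoothed_next_actions(batch, rng=rng)
```

Because the input batch is all zeros, any difference between rows of `out` comes from the noise alone, which makes it easy to check that each row received its own sample.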

… is wrong unless the lower bound is the negative of the upper bound. I switched to the sigmoid activation by default and then scale it properly. Before, the output of the tanh layer was just multiplied by the upper_bound, so the output of the tanh layer was scaled to +/-upper_bound.
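A small sketch of the two scalings, in plain Python. It assumes the sigmoid output is mapped affinely onto `[lower, upper]`; the function names are hypothetical and the actual ExaRL layer code may differ.

```python
import math


def scale_tanh_old(raw, upper):
    """Old behaviour as described: tanh output times upper, giving [-upper, upper]."""
    return upper * math.tanh(raw)


def scale_sigmoid(raw, lower, upper):
    """Assumed fix: sigmoid output in (0, 1) mapped affinely onto [lower, upper]."""
    s = 1.0 / (1.0 + math.exp(-raw))
    return lower + (upper - lower) * s


# With asymmetric bounds [2, 10], the old scaling can leave the valid range.
old_low = scale_tanh_old(-50.0, 10.0)      # close to -10, below the lower bound
new_low = scale_sigmoid(-50.0, 2.0, 10.0)  # close to 2
new_mid = scale_sigmoid(0.0, 2.0, 10.0)    # midpoint of the range
```

The tanh version is only correct for symmetric bounds, which is why the two agree when `lower == -upper` and diverge otherwise.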

jmohdyusof and others added 30 commits December 2, 2021 10:19

File flake8 errors.

4e87d51

updated booster env

8b590ed

new ExaBooster env

06cbe22

adding new model

d33303d

Continuous ExaBooster env support

1502a0e

modular version of rma learner

f4aad2b

Adding TD3 agent

cef5859

cleaning up some code

ad1cb13

Create keras_td3.py

4405788

New keras agent based on TD3

Merge pull request #201 from schr476/patch-2

c8143f5

Create keras_td3.py

fixing style errors

1ccfc1e

adding new agents and revamping workflows

93b1098

debugging async learner

8206840

Merge branch 'develop' into develop_init_cleaning

4ec5844

Conflicts: exarl/workflows/workflow_vault/async_learner.py exarl/workflows/workflow_vault/sync_learner.py

Added ExaBooster Training Data

c184344

Adding tests for:

709b412

Data structures Comms Environments

Minor name change.

f0a6a63

Putting candle driver opts in agent test.

76a700a

Got agent unit test set up and cleaned up some comments.

ac1fd53

Add code to read CONFIG_DIR if set to allow flexible launch options.

f2260eb

Merge branch 'develop' of github.com:exalearn/ExaRL into develop

1480db1

Some progress on agent testing

b434be8

adding summit_environment.yml file for building a conda env on summit

aa43143

updating summit_environment.yml file

ba14650

1. Restructure to avoid circular dependencies in candleDriver

c82f053

2. Revert lookup params to run_params to have a single point of change

updating pytest learner_cfg.json

ed5212a

adding try catches around run_params in init

8531959

specifying gym version in setup.py:

2da3cb1

hadrec

e90abdb

powergrid

c27d5ee

rvinaybharadwaj and others added 29 commits July 6, 2022 09:56

casting all variables in calc_target_f to float32

49b323f

Merge pull request #231 from exalearn/dqn_tfcast

960114c

casting all variables in calc_target_f to float32

fix to config files params: model/model_type

11bf197

restoring learner_cfg for Cartpole/DQN

ea1e03e

Merge pull request #233 from exalearn/config_fix

c4c4eeb

fix to config files params: model/model_type

Changes for Tom to review.

2c48242

Second attempt at patch.

38bc44d

Adding .gitignore

234d970

Adding back the code to convert between continuous and discrete action

db1e707

spaces.

Adding fix in TD3-v1 and DDPG-v0 for the q_target to properly account…

57cd1d3

… for end of episodes. Still touching some things up to get TD3-v1 working with the new fixBroken branch.

Finished fixing TD3_v1 using keras_td3.py. Now both TD3-v1 and DDPG-v…

380b001

…0 work with the fix to the q target calculation and do great on the Pendulum test problem
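The commit does not show the corrected formula. A common way to account for episode ends in a one-step Q target is to mask the bootstrap term with the done flag, sketched below; the function name `q_target` and its arguments are hypothetical, and the ExaRL implementation may be written differently.

```python
def q_target(reward, next_q, done, gamma=0.99):
    """One-step TD target that stops bootstrapping at terminal transitions."""
    # When done is True the next state has no future value, so only the
    # immediate reward contributes to the target.
    return reward + gamma * (1.0 - float(done)) * next_q


terminal = q_target(1.0, 10.0, True, gamma=0.5)
non_terminal = q_target(1.0, 10.0, False, gamma=0.5)
```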

Also need to address a bug in the replay buffer when the action space…

21a5fc2

… has dimension > 1.

Never mind. The right change is for TD3-v1's actor not to give back a l…

1a636c8

…ist for no reason. Not sure why that was the case before but it works just fine now with the simpler fix

Forgot to commit the removal of the np.squeeze also. Now this version…

bd208d3

… works with Pendulum just as well as DDPG.

Committing the nStepBuffer that I believe is working, need to do more…

b5044c0

… testing however before pull request.

Merge pull request #257 from exalearn/fix_done_bug

9434c04

Fix done bug

Update ddpg.py

6f4d5f1

The OUNoise call only works for a scalar action. Currently the noise is a scalar, so if the action space is bigger, it adds the same value to all actions. This fix ensures that the OUNoise matches the shape of the action space.
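As an illustration of the fix described in this commit, here is a minimal Ornstein-Uhlenbeck noise sketch whose state has the same shape as the action space, so each action component gets its own noise value. The class layout and the parameters `mu`, `theta`, `sigma`, and `dt` are assumptions for illustration, not the code in ddpg.py.

```python
import numpy as np


class OUNoise:
    """Ornstein-Uhlenbeck process with one independent component per action."""

    def __init__(self, action_shape, mu=0.0, theta=0.15, sigma=0.2,
                 dt=1e-2, seed=None):
        self.mu = mu * np.ones(action_shape)
        self.theta = theta
        self.sigma = sigma
        self.dt = dt
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        # Start the process at its mean.
        self.state = self.mu.copy()

    def __call__(self):
        # Draw a separate Gaussian increment for every action component.
        increment = self.rng.normal(size=self.state.shape)
        dx = (self.theta * (self.mu - self.state) * self.dt
              + self.sigma * np.sqrt(self.dt) * increment)
        self.state = self.state + dx
        return self.state


noise = OUNoise(action_shape=(3,), seed=0)
sample = noise()
```

A scalar state would instead add the same value to every action, which is the behaviour the commit message describes as the bug.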

Merge pull request #262 from exalearn/ddpg_OU_noise_patch

83e6f35

Update ddpg.py

Converting the nStep Buffer to using the buffer builder interface. Te…

6d33cb8

…sted on the Pendulum case. Again, need to do some performance testing at some point, but the buffer does work on the Pendulum problem. Also added a default buffer for the TD3-v1 case.

Merge pull request #261 from exalearn/nStep_capability

357e4fb

Added the capability to do n-step DDPG-v0 and TD3-v1
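For readers unfamiliar with n-step targets, the sketch below shows the standard n-step return that such a buffer typically produces: discounted rewards over n transitions plus a discounted bootstrap value, cut short if the episode ends. The function name and signature are hypothetical and do not reflect the nStepBuffer interface.

```python
def n_step_return(rewards, dones, bootstrap_q, gamma=0.99):
    """Sum discounted rewards over n steps, then bootstrap unless the episode ended."""
    total = 0.0
    discount = 1.0
    for reward, done in zip(rewards, dones):
        total += discount * reward
        if done:
            # Episode ended inside the window: no bootstrap term.
            return total
        discount *= gamma
    return total + discount * bootstrap_q


full = n_step_return([1.0, 1.0, 1.0], [False, False, False], 8.0, gamma=0.5)
cut = n_step_return([1.0, 1.0, 1.0], [False, True, False], 8.0, gamma=0.5)
```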

Also committing the fix to tf_ac.py that I mentioned in Slack - that …

cba9139

…the Dense layer output is sized wrong if the action space is of higher dimension than 1.

I think this should port the added Soft Actor-Critic capability to th…

99fb00d

…e version of ExaRL that works with the old version of gym. I don't have the old version working to test it though.

Adding the configuration files that I forgot to originally

eb56ef0

And add the parser argument to get it to work properly. Oops.

bc9e89e

Merge pull request #263 from exalearn/fix_td3_noise_shape

0df408b

Fixing a shape issue in TD3. In train_critic,

Merge pull request #265 from exalearn/add_sac_fixBroken

4c92810

Adding Soft-Actor Critic method to the fixBroken branch

thomasflynn918 requested a review from cmahrens December 19, 2023 22:06

Fix broken #267
thomasflynn918 wants to merge 146 commits into master from fixBroken

7 participants