[WIP] Added BCQ #378
Buffers have been tested, but not since BCQ was added, so the tests are failing right now.
This pull request introduces 3 alerts when merging 6c271ef into b8a45ab - view on LGTM.com
Codecov Report
@@            Coverage Diff             @@
##           master     #378      +/-   ##
==========================================
- Coverage   91.28%   88.51%   -2.77%
==========================================
  Files          90       93       +3
  Lines        3809     3944     +135
==========================================
+ Hits         3477     3491      +14
- Misses        332      453     +121
This pull request introduces 4 alerts when merging b28c1e6 into a2c8c7e - view on LGTM.com
This pull request introduces 4 alerts when merging 3db4733 into 25eb018 - view on LGTM.com
This pull request introduces 4 alerts when merging 43a483e into 25eb018 - view on LGTM.com
Stuff implemented:
- OffPolicyAgentAC: the architecture is very similar to TD3. The major differences are that the actor takes both state and action as input, and the addition of the VAE.
- OfflineTrainer inherits from OffPolicyTrainer. The only difference is that it loads the buffer.
- BaseBuffer: removed redundant functions and converted all code to torch. No numpy is used in any of the buffer files now.

Stuff to do: