Checkpointing support with Ray Tune

It would be nice to make `modelfree.hyperparams.train_rl` a `tune.Trainable` rather than a function, adding checkpointing support. This would let us use the HyperBand and Population Based Training schedulers. Conceptually this is easy enough: we already support saving models via `save_callbacks`, and can restore using `load_path`. However, the interfaces don't quite line up: Ray expects `_train` to perform one small training step, with `_save` called in between steps. There's no good way to make Stable Baselines return part-way through training. We could call it repeatedly with a small `total_timesteps`, but then progress would be measured within each call rather than over the whole run, breaking annealers.
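To make the annealing problem concrete, here is a minimal toy sketch. Everything in it is hypothetical: `ToyLearner` is a stand-in rather than the Stable Baselines API, and `ChunkedTrainable` is a plain class that only mirrors the `_train`/`_save`/`_restore` shape mentioned above, so it runs without Ray installed. The `horizon` and `start` arguments are assumptions about what a learner would need to expose for chunked training to anneal correctly; nothing here implies Stable Baselines offers them.

```python
import json
import os
import tempfile


def linear_schedule(initial_value):
    """Anneal linearly from initial_value towards 0 as progress_remaining goes 1 -> 0."""
    return lambda progress_remaining: initial_value * progress_remaining


class ToyLearner:
    """Hypothetical stand-in for an RL algorithm (not the Stable Baselines API)."""

    def __init__(self, lr_schedule):
        self.lr_schedule = lr_schedule
        self.num_timesteps = 0
        self.lr_history = []

    def learn(self, total_timesteps, horizon=None, start=0):
        # By default, progress is measured against this call's total_timesteps,
        # which is the behaviour that breaks annealing under chunked calls.
        if horizon is None:
            horizon = total_timesteps
        for t in range(total_timesteps):
            progress_remaining = 1.0 - (start + t) / horizon
            self.lr_history.append(self.lr_schedule(progress_remaining))
            self.num_timesteps += 1


class ChunkedTrainable:
    """Plain class with the _train/_save/_restore shape; does not subclass tune.Trainable."""

    def __init__(self, total_timesteps, chunk, fix_progress):
        self.total_timesteps = total_timesteps
        self.chunk = chunk
        self.fix_progress = fix_progress
        self.learner = ToyLearner(linear_schedule(1.0))

    def _train(self):
        # One small training step: run a single chunk, then return control.
        start = self.learner.num_timesteps
        steps = min(self.chunk, self.total_timesteps - start)
        if self.fix_progress:
            self.learner.learn(steps, horizon=self.total_timesteps, start=start)
        else:
            self.learner.learn(steps)
        return {"timesteps_total": self.learner.num_timesteps,
                "done": self.learner.num_timesteps >= self.total_timesteps}

    def _save(self, checkpoint_dir):
        path = os.path.join(checkpoint_dir, "state.json")
        with open(path, "w") as f:
            json.dump({"num_timesteps": self.learner.num_timesteps,
                       "lr_history": self.learner.lr_history}, f)
        return path

    def _restore(self, path):
        with open(path) as f:
            state = json.load(f)
        self.learner.num_timesteps = state["num_timesteps"]
        self.learner.lr_history = state["lr_history"]


def run(fix_progress):
    trainable = ChunkedTrainable(total_timesteps=8, chunk=4, fix_progress=fix_progress)
    while not trainable._train()["done"]:
        pass
    return trainable.learner.lr_history


naive_lrs = run(fix_progress=False)  # schedule restarts at every chunk
fixed_lrs = run(fix_progress=True)   # schedule spans the whole run

# Checkpoint after the first chunk, restore into a fresh object, then finish.
first = ChunkedTrainable(total_timesteps=8, chunk=4, fix_progress=True)
first._train()
with tempfile.TemporaryDirectory() as tmp:
    checkpoint = first._save(tmp)
    resumed = ChunkedTrainable(total_timesteps=8, chunk=4, fix_progress=True)
    resumed._restore(checkpoint)
resumed._train()
resumed_lrs = resumed.learner.lr_history
```

In the naive run the learning rate jumps back to its initial value at the start of the second chunk, whereas passing the global horizon and offset gives one continuous decay, and the resumed run matches it. Whether an equivalent hook can be added to Stable Baselines without patching it is the open question in this issue.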

Checkpointing support with ray Tune #5