Extracting total-loss, PPO-loss, rewards per step, returns per step in RLHF-PPO implementation #605

@Ritam-M

📚 The doc issue

I need help extracting the total loss, the PPO loss, the per-step rewards, and the per-step returns from the RLHF-PPO implementation.
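For reference, here is a minimal, library-independent sketch of how these four quantities are commonly defined in an RLHF-PPO setup. It is not this repository's API: every function name and parameter below (`per_step_rewards`, `kl_coef`, `vf_coef`, and so on) is hypothetical. It assumes a KL-penalised per-token reward with the sequence-level score added at the last token, and it uses plain return-minus-value advantages instead of GAE to stay short.

```python
import math

def per_step_rewards(score, logprobs, ref_logprobs, kl_coef=0.1):
    """Per-token reward: -kl_coef * (logp - ref_logp), plus the
    sequence-level score on the final token. Assumes a non-empty sequence."""
    rewards = [-kl_coef * (lp - ref) for lp, ref in zip(logprobs, ref_logprobs)]
    rewards[-1] += score
    return rewards

def discounted_returns(rewards, gamma=1.0):
    """Return-to-go per step: G_t = r_t + gamma * G_{t+1}."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

def ppo_clip_loss(new_logprobs, old_logprobs, advantages, clip=0.2):
    """Clipped surrogate objective, negated so that lower is better."""
    losses = []
    for new, old, adv in zip(new_logprobs, old_logprobs, advantages):
        ratio = math.exp(new - old)
        clipped = max(1.0 - clip, min(1.0 + clip, ratio))
        losses.append(-min(ratio * adv, clipped * adv))
    return sum(losses) / len(losses)

def value_loss(values, returns):
    """Mean squared error between value predictions and returns (scaled by 0.5)."""
    return 0.5 * sum((v - g) ** 2 for v, g in zip(values, returns)) / len(values)

def collect_metrics(score, new_logprobs, old_logprobs, ref_logprobs, values,
                    gamma=1.0, kl_coef=0.1, clip=0.2, vf_coef=0.5):
    """Gather the four requested quantities into one dict for logging."""
    rewards = per_step_rewards(score, old_logprobs, ref_logprobs, kl_coef)
    returns = discounted_returns(rewards, gamma)
    # Simplification: advantage = return - value (no GAE, no normalisation).
    advantages = [g - v for g, v in zip(returns, values)]
    ppo = ppo_clip_loss(new_logprobs, old_logprobs, advantages, clip)
    total = ppo + vf_coef * value_loss(values, returns)
    return {
        "rewards_per_step": rewards,
        "returns_per_step": returns,
        "ppo_loss": ppo,
        "total_loss": total,
    }

metrics = collect_metrics(
    score=1.0,
    new_logprobs=[-1.0, -0.5, -0.2],
    old_logprobs=[-1.1, -0.6, -0.2],
    ref_logprobs=[-1.0, -0.7, -0.3],
    values=[0.4, 0.6, 0.8],
)
print(metrics)
```

In a real trainer the same values would normally be read at the point where the loss is computed, then detached and logged; where that happens depends on the implementation.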

Suggest a potential alternative/fix

No response

Labels: documentation