[Epic] Data Pipeline for RLHF Tuning #392

RobotSail · 2024-12-07T04:05:00Z

To perform RLHF, we need to collect human preference feedback when tuning models. Several steps must be completed first so that the UI can support this.

We define the epic as follows:

The implementations are left as exercises for the reader.

vishnoianil added enhancement labels Dec 17, 2024

vishnoianil added this to UI Dec 17, 2024

vishnoianil moved this to Backlog in UI Dec 17, 2024

vishnoianil added this to the release-1.2 milestone Dec 17, 2024

vishnoianil added help wanted and removed enhancement labels Feb 8, 2025
