Description
This epic describes and establishes a prompt tuning process. The goal is a repeatable process built on a test set, evaluation criteria, and scoring, so we can see whether a change actually improved anything. To do this, we need at a minimum the following:
- baseline (control) prompt vs. candidate prompt, with versioning (easy enough, but a good convention to establish)
- scope of the proposed change, including all targeted items, e.g. number/date preservation, forbidden behaviors, etc.
- scoring lock (we have this, but need to lock it down to measure effectively)
- standard test set (we have a lot of this already, but it is important that we consistently evaluate against the same set)
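As a rough sketch of how these pieces could fit together, the Python below scores a baseline and a candidate prompt against the same test set with one fixed scoring function, and records a hash of the test set so two runs can be checked for using identical cases. The names `fingerprint_cases`, `compare_prompts`, `generate`, and `score`, and the `input` field on each case, are hypothetical and not part of any existing tooling here; `generate` stands in for whatever actually calls the model.

```python
import hashlib
import json
from statistics import mean
from typing import Callable, Dict, List

Case = Dict[str, str]


def fingerprint_cases(cases: List[Case]) -> str:
    """Return a SHA-256 hash of the test set so runs can confirm they used the same cases."""
    canonical = json.dumps(cases, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def compare_prompts(
    cases: List[Case],
    generate: Callable[[str, str], str],   # hypothetical: (prompt, input text) -> model output
    score: Callable[[Case, str], float],   # hypothetical: locked scorer, (case, output) -> score
    baseline_prompt: str,
    candidate_prompt: str,
) -> Dict[str, object]:
    """Score both prompts on the same cases with the same scorer and report the difference."""

    def mean_score(prompt: str) -> float:
        return mean([score(case, generate(prompt, case["input"])) for case in cases])

    baseline = mean_score(baseline_prompt)
    candidate = mean_score(candidate_prompt)
    return {
        "baseline": baseline,
        "candidate": candidate,
        "delta": candidate - baseline,
        "test_set_sha256": fingerprint_cases(cases),
    }
```

Passing the scorer in as a single function is one way to keep the scoring "locked": both prompts are necessarily judged by the same code, and the stored hash makes it visible if the test set changed between runs.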
Some rules to follow:
Prompt files are stored with a version ID, e.g.:
system_prompt_en.v1.txt
system_prompt_en.v2.txt
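To illustrate how this naming convention could be consumed by tooling, here is a minimal sketch that parses the version out of a filename and picks the newest version from a list of names. It assumes the pattern is always `<name>.v<integer>.txt`, as in the two examples above; the helper names `parse_prompt_filename` and `latest_prompt_file` are hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed pattern: <name>.v<integer>.txt, e.g. system_prompt_en.v2.txt
VERSIONED_PROMPT = re.compile(r"^(?P<name>.+)\.v(?P<version>\d+)\.txt$")


def parse_prompt_filename(filename: str) -> Tuple[str, int]:
    """Split a versioned prompt filename into (name, version number)."""
    match = VERSIONED_PROMPT.match(filename)
    if match is None:
        raise ValueError(f"not a versioned prompt file: {filename}")
    return match.group("name"), int(match.group("version"))


def latest_prompt_file(filenames: Iterable[str], name: str) -> Optional[str]:
    """Return the filename with the highest version for the given prompt name, or None."""
    best: Optional[Tuple[int, str]] = None
    for filename in filenames:
        match = VERSIONED_PROMPT.match(filename)
        if match is None or match.group("name") != name:
            continue  # skip unrelated or unversioned files
        version = int(match.group("version"))
        if best is None or version > best[0]:
            best = (version, filename)
    return best[1] if best else None
```

Comparing versions as integers rather than as strings matters once there are ten or more versions, since a plain string sort would place `v10` before `v2`.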