Merge pull request #44 from stanfordnlp/dev_zeta
[Major] Zeta version
frankaging authored Apr 18, 2024
2 parents 5cb58be + 6d518fa commit 531e7f9
Showing 26 changed files with 1,774 additions and 244 deletions.
9 changes: 9 additions & 0 deletions .gitignore
@@ -10,6 +10,14 @@ test.py
examples/loreft/dataset
memo_*.png
*.json
examples/safety/jail*
examples/safety/*.csv
examples/composition/compreft.py
examples/gradio/reft_*/
*/train_and_share*
examples/agent/reft_to_share/
examples/agent/train_and_share.ipynb
*/reft_to_share/
analyse.py
data.py
datasets/
@@ -23,6 +31,7 @@ templates.py
trainer.py
tmp/
*.DS_Store
examples/reward/reward/


# Byte-compiled / optimized / DLL files
9 changes: 9 additions & 0 deletions .gitmodules
@@ -0,0 +1,9 @@
[submodule "examples/gradio/prod/reft_goody2"]
path = examples/agent/prod/reft_goody2
url = https://huggingface.co/spaces/pyvene/reft_goody2
[submodule "examples/gradio/prod/reft_chat7b"]
path = examples/agent/prod/reft_chat7b
url = https://huggingface.co/spaces/pyvene/reft_chat7b
[submodule "examples/agent/prod/reft_emoji_chat"]
path = examples/agent/prod/reft_emoji_chat
url = https://huggingface.co/spaces/pyvene/reft_emoji_chat
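The `.gitmodules` entries above use git's INI-like config format, so the declared demo-space submodules can be listed programmatically. A minimal sketch with Python's standard `configparser` (the embedded snippet mirrors one entry from the diff; real git tooling would use `git config -f .gitmodules` instead):

```python
# Sketch: list the checkout paths declared in a .gitmodules file.
import configparser

GITMODULES = """\
[submodule "examples/gradio/prod/reft_chat7b"]
path = examples/agent/prod/reft_chat7b
url = https://huggingface.co/spaces/pyvene/reft_chat7b
"""

def submodule_paths(text: str) -> list[str]:
    """Parse .gitmodules-style text and return each submodule's path."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    return [cp[section]["path"] for section in cp.sections()]
```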
13 changes: 8 additions & 5 deletions README.md
@@ -13,13 +13,13 @@ Want to try a fine-tuning method that uses a fraction of the parameter count of
- Sharing the fine-tuned results easily to HuggingFace

> [!TIP]
> **A Short Video Introducing ReFT:** Watch [the video on YouTube](https://www.youtube.com/watch?v=GK2kritsbbM)!
> **Building ReFT LM-Agent in Minutes:** Check out our tutorial on using ReFT to adapt LMs with a few demonstrations at [ReFT-Agent](https://github.com/stanfordnlp/pyreft/tree/main/examples/agent)!
> [!TIP]
> **Powerful and Parameter-Efficient:** Read [our ReFT paper](https://arxiv.org/abs/2404.03592) for an introduction to representation fine-tuning (ReFT) and its performance.
> **Our ReFT-Chat (instruct-tuned for 18 mins and a single GPU) is hosted live on** [HuggingFace Space](https://huggingface.co/spaces/pyvene/reft_chat7b_1k)!
> [!TIP]
> **Interpretable Fine-tuning:** Read [Composable ReFT](https://github.com/stanfordnlp/pyreft/tree/main/examples/composition) for a sneak peek at the interpretable nature of ReFT.
> **A Short Video Introducing ReFT:** Watch [the video on YouTube](https://www.youtube.com/watch?v=GK2kritsbbM)!
## Quickstart

@@ -168,7 +168,7 @@ completes the request.
device = "cuda" if torch.cuda.is_available() else "cpu"

model_name_or_path = "meta-llama/Llama-2-7b-hf"
reft_model_name_or_path = "zhengxuanzenwu/Loreft1k-Llama-2-7b-hf"
reft_model_name_or_path = "pyvene/reft_chat7b_1k"
tokenizer = transformers.AutoTokenizer.from_pretrained(
model_name_or_path, model_max_length=2048, padding_side="right", use_fast=False)
tokenizer.pad_token = tokenizer.unk_token
@@ -181,7 +181,7 @@ Then, loading ReFT artifacts:

```py
reft_model = ReftModel.load(
"zhengxuanzenwu/Loreft1k-Llama-2-7b-hf", model, from_huggingface_hub=True)
reft_model_name_or_path, model, from_huggingface_hub=True)
reft_model.set_device(device)
```
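The surrounding README code prompts the chat model with an Alpaca-style instruction template (the "completes the request." fragment in the hunk header above is part of it). A minimal sketch of building such a prompt; the exact template used by `reft_chat7b_1k` is an assumption here, not confirmed by this diff:

```python
# Alpaca-style instruction prompt (assumed template, for illustration).
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n%s\n\n### Response:"
)

def build_prompt(instruction: str) -> str:
    """Fill the instruction slot of the template before tokenization."""
    return PROMPT_TEMPLATE % instruction
```

The resulting string would then be tokenized and passed to `reft_model` for generation.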

@@ -227,6 +227,9 @@ We showcase ReFT performance on various benchmarks against popular PEFTs such as
| [Alpaca](https://github.com/stanfordnlp/pyreft/tree/main/examples/alpaca) | Instruction-tune LMs with ReFT |
| [ReFT Interp](https://github.com/stanfordnlp/pyreft/tree/main/examples/memorisation) | Some hints on why ReFT works |
| [Composable ReFT](https://github.com/stanfordnlp/pyreft/tree/main/examples/composition) | Why ReFT is an interpretable method |
| [Reward Modeling w/ ReFT](https://github.com/stanfordnlp/pyreft/tree/main/examples/reward) | Reward Model with ReFT |
| [Safety w/ ReFT](https://github.com/stanfordnlp/pyreft/tree/main/examples/safety) | Guardrail with ReFT |
| [LM-Agent w/ ReFT](https://github.com/stanfordnlp/pyreft/tree/main/examples/agent) | Train and Deploy Your ReFT in Minutes |

## Citation
Make sure you cite the **ReFT** paper:
16 changes: 16 additions & 0 deletions examples/agent/README.md
@@ -0,0 +1,16 @@
# Train ReFT Agents in Few-shot Settings, and Deploy Them with Gradio

Training is based on the notebook [`train_and_share.ipynb`](https://github.com/stanfordnlp/pyreft/blob/main/examples/agent/train_and_share.ipynb).

This notebook also walks you through uploading your trained ReFT agent to the HuggingFace model hub, so it can easily be shared with others.


## Our Ethos-Chat (A GOODY-2 Imitator)

The deployed Gradio demo can be found [here](https://huggingface.co/spaces/pyvene/reft_ethos).


## Our Chat-model

The deployed Gradio demo can be found [here](https://huggingface.co/spaces/pyvene/reft_chat7b).
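The deployed spaces above wrap the trained agent in a Gradio chat UI. A hypothetical sketch of that plumbing (function names and the placeholder reply are illustrative, not taken from the spaces' actual code):

```python
# Sketch of serving a ReFT agent with Gradio's chat interface.
def respond(message: str, history: list) -> str:
    # A real space would build a prompt from `message` and `history`
    # and call reft_model.generate(...); here we return a placeholder.
    return f"ReFT agent would answer: {message}"

if __name__ == "__main__":
    import gradio as gr  # assumed installed in the deployed space
    gr.ChatInterface(respond).launch()
```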

1 change: 1 addition & 0 deletions examples/agent/prod/reft_chat7b
Submodule reft_chat7b added at f502fc
1 change: 1 addition & 0 deletions examples/agent/prod/reft_emoji_chat
Submodule reft_emoji_chat added at 52b0ee
1 change: 1 addition & 0 deletions examples/agent/prod/reft_goody2
Submodule reft_goody2 added at b7894d
