This repo contains a fork of PurpleLlama for the ProSec project.
- Install the dependencies of PurpleLlama:

```bash
cd CybersecurityBenchmarks
pip install -r requirements.txt
# install cargo if not already installed
# sudo apt-get install cargo
# Then add $HOME/.cargo/bin to your PATH
cargo install weggli
```

- Install vLLM:

```bash
pip install vllm==0.7.3
```
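Optionally, you can sanity-check both installs before moving on. This assumes standard CLI behavior for the two tools:

```bash
weggli --help > /dev/null && echo "weggli OK"
python -c "import vllm; print(vllm.__version__)"   # expect 0.7.3
```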
Use the following commands to run experiments:

```bash
./eval-quick.sh <path to model> <name of the run> <vllm port>
# or
./eval-full.sh <path to model> <name of the run> <vllm port>
```

For example,
```bash
./eval-quick.sh model-ckpts/prosec-phi3mini first-run 8001
```

will run the evaluation with the model checkpoint at `model-ckpts/prosec-phi3mini`, name the run `first-run`, and host the vLLM server on port 8001.
The two scripts take the same arguments and work the same way: `eval-quick.sh` runs the experiment on a small subset of the evaluation dataset, while `eval-full.sh` runs the full evaluation.
Both scripts first host the given model on a local vLLM server and then invoke the PurpleLlama scripts to evaluate the hosted model, as sketched below. The results are saved in `CybersecurityBenchmarks/datasets/instruct-stat`. Once the evaluation is done, both scripts kill the vLLM server to free up resources.
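For orientation, here is a minimal sketch of that host → evaluate → kill flow. It is not the actual contents of `eval-quick.sh`/`eval-full.sh`: the server flags, the readiness check, and the benchmark invocation in step 3 are illustrative assumptions.

```bash
#!/bin/bash
# Hypothetical outline of the eval scripts; real flag names and the
# benchmark command may differ.
MODEL_PATH=$1; RUN_NAME=$2; PORT=$3

# 1. Host the model on a local vLLM server (OpenAI-compatible API).
python -m vllm.entrypoints.openai.api_server \
    --model "$MODEL_PATH" --port "$PORT" &
SERVER_PID=$!

# 2. Block until the server starts answering requests.
until curl -sf "http://localhost:$PORT/v1/models" > /dev/null; do
    sleep 5
done

# 3. Run the PurpleLlama benchmark against the hosted model; $RUN_NAME
#    labels the output under CybersecurityBenchmarks/datasets/instruct-stat.
#    (Placeholder; see the scripts for the actual invocation.)
# python3 -m CybersecurityBenchmarks.benchmark.run ...

# 4. Free GPU/CPU resources once the evaluation finishes.
kill "$SERVER_PID"
```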
