
Commit 144d72d

Update documentation from main repository

1 parent f19fe14

30 files changed: +956 -1184 lines

docs/README.md

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
# ServerlessLLM documents

Please find our documentation at [ServerlessLLM](https://serverlessllm.github.io/docs/stable/getting_started/quickstart).

## How to build the ServerlessLLM Docs

This website is built using Docusaurus, a modern static website generator.

### Installation

To install the necessary dependencies, run:

```bash
npm install
```

### Local Development

To start a local development server and open a browser window, run:

```bash
npm run start
```

Most changes are reflected live without restarting the server.

### Build

To generate static content into the `build` directory, run:

```bash
npm run build
```

The generated static content can be served by any static content hosting service.

### About the image path

Images are stored in the `images` directory. For example, given an image `a.jpg` in `images`, reference it anywhere in the documents as `/img/a.jpg`. (The document sync bot copies the `images` directory into the `img` folder of the `serverlessllm.github.io` repo.)
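As a minimal sketch of this convention (reusing the `a.jpg` example above), an image saved as `images/a.jpg` would be embedded in a document with standard Markdown image syntax:

```markdown
![Example image](/img/a.jpg)
```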

docs/stable/cli/cli_api.md renamed to docs/api/cli.md

Lines changed: 71 additions & 19 deletions
@@ -1,22 +1,40 @@
---
sidebar_position: 2
---

# CLI API

## ServerlessLLM CLI Documentation

### Overview
`sllm-cli` is a command-line interface (CLI) tool designed to manage and interact with ServerlessLLM models. This document provides an overview of the available commands and their usage.

### Installation

```bash
# Create a new environment
conda create -n sllm python=3.10 -y
conda activate sllm

# Install ServerlessLLM
pip install serverless-llm
```

### Getting Started

Before using the `sllm-cli` commands, you need to start the ServerlessLLM cluster. Follow the guides below to set up your cluster:

- [Single Machine Deployment](../stable/gettting_started.md)
- [Single Machine Deployment (From Scratch)](../stable/deployment/single_machine.md)
- [Multi-Machine Deployment](../stable/deployment/multi_machine.md)
- [SLURM Cluster Deployment](../stable/deployment/slurm_cluster.md)

After setting up the ServerlessLLM cluster, you can use the commands listed below to manage and interact with your models.

### Example Workflow

1. **Deploy a Model**
   > Deploy a model using the model name, which must be a HuggingFace pretrained model name, e.g. `facebook/opt-1.3b` rather than `opt-1.3b`.
   ```bash
   sllm-cli deploy --model facebook/opt-1.3b
   ```
@@ -45,6 +63,8 @@ After setting up the ServerlessLLM cluster, you can use the commands listed belo
### sllm-cli deploy
Deploy a model using a configuration file or model name, with options to overwrite default configurations. The configuration file requires minimal specifications, as sensible defaults are provided for advanced configuration options.

This command also supports [PEFT LoRA (Low-Rank Adaptation)](https://huggingface.co/docs/peft/main/en/index), allowing you to deploy adapters on top of a base model, either via CLI flags or directly in the configuration file.

For more details on the advanced configuration options and their default values, please refer to the [Example Configuration File](#example-configuration-file-configjson) section.

##### Usage
@@ -74,6 +94,12 @@ sllm-cli deploy [OPTIONS]
- `--max-instances <number>`
  - Overwrite the maximum instances in the default configuration.

- `--enable-lora`
  - Enable LoRA adapter support for the transformers backend. Overwrites `enable_lora` in the default configuration.

- `--lora-adapters`
  - Add one or more LoRA adapters in the format `<name>=<path>`. Overwrites any existing `lora_adapters` in the default configuration.

##### Examples
Deploy using a model name with default configuration:
```bash
sllm-cli deploy --model facebook/opt-1.3b
```
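As an illustration of the `<name>=<path>` adapter format, here is a hypothetical sketch (not ServerlessLLM's actual parser) of how such specs map onto the `lora_adapters` dictionary used in the configuration file:

```python
def parse_lora_adapters(specs):
    """Parse adapter specs like 'name=path' into a {name: path} dict.

    Hypothetical helper for illustration only; sllm-cli performs its
    own parsing internally.
    """
    adapters = {}
    for spec in specs:
        name, sep, path = spec.partition("=")
        if not sep or not name or not path:
            raise ValueError(f"expected '<name>=<path>', got {spec!r}")
        adapters[name] = path
    return adapters

# The two adapters from the LoRA deploy example:
print(parse_lora_adapters([
    "demo_lora1=crumb/FLAN-OPT-1.3b-LoRA",
    "demo_lora2=GrantC/alpaca-opt-1.3b-lora",
]))
```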
@@ -95,6 +121,11 @@ Deploy using a model name and overwrite multiple configurations:
```bash
sllm-cli deploy --model facebook/opt-1.3b --num-gpus 2 --target 5 --min-instances 1 --max-instances 5
```

Deploy a base model with multiple LoRA adapters:
```bash
sllm-cli deploy --model facebook/opt-1.3b --backend transformers --enable-lora --lora-adapters demo_lora1=crumb/FLAN-OPT-1.3b-LoRA demo_lora2=GrantC/alpaca-opt-1.3b-lora
```
##### Example Configuration File (`config.json`)
This file can be incomplete, and missing sections will be filled in by the default configuration:
@@ -113,7 +144,12 @@ This file can be incomplete, and missing sections will be filled in by the defau
```json
    "pretrained_model_name_or_path": "facebook/opt-1.3b",
    "device_map": "auto",
    "torch_dtype": "float16",
    "hf_model_class": "AutoModelForCausalLM",
    "enable_lora": true,
    "lora_adapters": {
      "demo_lora1": "crumb/FLAN-OPT-1.3b-LoRA",
      "demo_lora2": "GrantC/alpaca-opt-1.3b-lora"
    }
  }
}
```
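To see why the file can be incomplete, consider how missing keys can be filled in from defaults. The following is a hypothetical sketch of such a recursive merge (ServerlessLLM's actual merge logic may differ):

```python
def merge_defaults(defaults, overrides):
    """Recursively overlay user-supplied keys on top of default values."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections (e.g. backend_config) key by key.
            merged[key] = merge_defaults(merged[key], value)
        else:
            merged[key] = value
    return merged

defaults = {"num_gpus": 1, "backend_config": {"torch_dtype": "float16"}}
user_config = {"backend_config": {"enable_lora": True}}
print(merge_defaults(defaults, user_config))
```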
@@ -136,23 +172,38 @@ Below is a description of all the fields in config.json.
| backend_config.device_map | Device map config used to load the model; `auto` is suitable for most scenarios. |
| backend_config.torch_dtype | Torch dtype of the model. |
| backend_config.hf_model_class | HuggingFace model class. |
| backend_config.enable_lora | Set to `true` to enable loading LoRA adapters during inference. |
| backend_config.lora_adapters | A dictionary of LoRA adapters in the format `{name: path}`, where each path is a local or Hugging Face-hosted LoRA adapter directory. |

### sllm-cli delete
Delete deployed models by name, or delete specific LoRA adapters associated with a base model.

This command supports:
- Removing deployed models
- Removing specific LoRA adapters while preserving the base model

##### Usage
```bash
sllm-cli delete [MODELS] [OPTIONS]
```

##### Arguments
- `MODELS`
  - Space-separated list of model names to delete.

##### Options
- `--lora-adapters <adapter_names>`
  - Space-separated list of LoRA adapter names to delete from the given model. If provided, the base model will not be deleted; only the specified adapters will be removed.

##### Example
Delete multiple base models (and all their adapters):
```bash
sllm-cli delete facebook/opt-1.3b facebook/opt-2.7b meta/llama2
```
Delete specific LoRA adapters from a base model, keeping the base model:
```bash
sllm-cli delete facebook/opt-1.3b --lora-adapters demo_lora1 demo_lora2
```
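The two delete modes above can be summarized with a hypothetical sketch of the semantics (illustrative only, not the actual ServerlessLLM implementation):

```python
def delete_models(registry, model, lora_adapters=None):
    """Sketch of delete semantics: registry maps model names to metadata,
    including an 'adapters' dict of LoRA adapters."""
    if lora_adapters:
        # Remove only the named adapters; the base model stays deployed.
        for name in lora_adapters:
            registry[model]["adapters"].pop(name, None)
    else:
        # Remove the base model together with all of its adapters.
        registry.pop(model, None)
    return registry

registry = {"facebook/opt-1.3b": {"adapters": {"demo_lora1": "path1", "demo_lora2": "path2"}}}
delete_models(registry, "facebook/opt-1.3b", ["demo_lora1"])
print(registry)  # base model remains, demo_lora1 is gone
```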

### sllm-cli generate
Generate outputs using the deployed model.
@@ -217,7 +268,7 @@ sllm-cli encode --threads 4 /path/to/request.json
```json
    "model": "intfloat/e5-mistral-7b-instruct",
    "task_instruct": "Given a question, retrieve passages that answer the question",
    "query": [
        "Hi, how are you?"
    ]
}
```
@@ -267,7 +318,7 @@ sllm-cli update --config /path/to/config.json
267318
```
268319

269320
### sllm-cli fine-tuning
Fine-tune the deployed model.

##### Usage
```bash
sllm-cli fine-tuning --base-model <model_name> --config <path_to_ft_config_file>
```
@@ -289,20 +340,19 @@ sllm-cli fine-tuning --base-model <model_name> --config <path_to_ft_config_file>
##### Example Configuration File (`ft_config.json`)
```json
{
    "model": "facebook/opt-125m",
    "ft_backend": "peft",
    "dataset_config": {
        "dataset_source": "hf_hub",
        "hf_dataset_name": "fka/awesome-chatgpt-prompts",
        "tokenization_field": "prompt",
        "split": "train",
        "data_files": "",
        "extension_type": ""
    },
    "lora_config": {
        "r": 4,
        "lora_alpha": 1,
        "lora_dropout": 0.05,
        "bias": "lora_only",
        "task_type": "CAUSAL_LM"
    }
}
```
@@ -322,26 +372,28 @@ Below is a description of all the fields in ft_config.json.
|--------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------|
| model | This should be a deployed model name, used to identify the backend instance. |
| ft_backend | Fine-tuning engine; only `peft` is supported for now. |
| dataset_config | Configuration for the fine-tuning dataset |
| dataset_config.dataset_source | Whether the dataset comes from `hf_hub` (huggingface_hub) or a `local` file |
| dataset_config.hf_dataset_name | Dataset name on huggingface_hub |
| dataset_config.tokenization_field | The field to tokenize |
| dataset_config.split | Partition of the dataset (`train`, `validation`, or `test`). You can also slice the selected split, e.g. take only the first 10% of the training data: `train[:10%]` |
| dataset_config.data_files | Data files to load from local storage |
| dataset_config.extension_type | Extension type of the data files (`csv`, `json`, `parquet`, `arrow`) |
| lora_config | Configuration for LoRA fine-tuning |
| lora_config.r | `r` defines how many parameters will be trained. |
| lora_config.lora_alpha | A multiplier controlling the overall strength of connections within a neural network, typically set to 1 |
| lora_config.target_modules | A list of the target modules available in the [Hugging Face PEFT source][1] |
| lora_config.lora_dropout | Dropout rate, used to avoid overfitting |
| lora_config.bias | Use `none` or `lora_only` |
| lora_config.task_type | Indicates the task the model is being trained for |
| training_config | Configuration for training parameters |
| training_config.auto_find_batch_size | Automatically find a batch size that fits the data. |
| training_config.num_train_epochs | Total number of training epochs |
| training_config.learning_rate | Learning rate |
| training_config.optim | Optimizer to use |
| training_config.use_cpu | Whether to use CPU for training |

[1]: https://github.com/huggingface/peft/blob/39ef2546d5d9b8f5f8a7016ec10657887a867041/src/peft/utils/other.py#L220
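The `train[:10%]` syntax follows the Hugging Face datasets split convention. As a rough, hypothetical illustration in plain Python (not how the datasets library actually implements it), a percentage slice selects a prefix of the split's examples:

```python
import re

def take_percent_slice(examples, spec):
    """Apply a split spec like 'train[:10%]' to a list of examples.

    Illustrative only: supports just the '<split>' and '<split>[:N%]'
    forms mentioned in the table above.
    """
    m = re.fullmatch(r"(\w+)(?:\[:(\d+)%\])?", spec)
    if not m:
        raise ValueError(f"unsupported split spec: {spec}")
    percent = int(m.group(2)) if m.group(2) else 100
    cutoff = len(examples) * percent // 100
    return examples[:cutoff]

data = list(range(100))
print(len(take_percent_slice(data, "train[:10%]")))  # 10
print(len(take_percent_slice(data, "train")))        # 100
```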

### sllm-cli status
Check the information of deployed models.
@@ -354,4 +406,4 @@ sllm-cli status
#### Example
```bash
sllm-cli status
```

docs/api/intro.md

Lines changed: 5 additions & 1 deletion
@@ -2,4 +2,8 @@
sidebar_position: 1
---

# API Introduction

Welcome to the ServerlessLLM API documentation. This section contains detailed information about the various APIs provided by ServerlessLLM:

- [CLI API](./cli.md) - Documentation for the `sllm-cli` command-line interface

docs/images/wechat.png

7.24 KB

docs/stable/cli/_category_.json

Lines changed: 0 additions & 4 deletions
This file was deleted.
Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
```json
{
    "label": "Deployment",
    "position": 3
}
```
