Commit 95b3e43

Document Sync by Tina
1 parent 8cae5e8 commit 95b3e43

File tree: 1 file changed (+26 −81 lines)

docs/stable/getting_started/docker_quickstart.md (+26 −81)

@@ -4,92 +4,52 @@ sidebar_position: 2
 
 # Docker Quickstart Guide
 
-This guide will help you get started with the basics of using ServerlessLLM with Docker. Please make sure you have Docker installed on your system and have installed ServerlessLLM CLI following the [installation guide](./installation.md).
+This guide shows how to quickly set up a local ServerlessLLM cluster using Docker Compose. We will start a cluster with a head node and two worker nodes, then deploy and query a model using the `sllm-cli`.
 
 ## Pre-requisites
 
-Ensure you have the following pre-requisites:
+Before you begin, make sure you have the following:
 
-1. **GPUs**: Ensure you have at least 2 GPUs available. If more GPUs are provided, you can adjust the number of workers and the number of devices assigned to each worker.
-2. **NVIDIA Docker Toolkit**: This allows Docker to use NVIDIA GPUs. You can find the installation guide [here](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html).
+1. **Docker**: Installed on your system. You can download it from [here](https://docs.docker.com/get-docker/).
+2. **ServerlessLLM CLI**: Installed on your system. You can install it using `pip install serverless-llm`.
+3. **GPUs**: At least 2 NVIDIA GPUs are necessary. If you have more GPUs, you can adjust the `docker-compose.yml` file accordingly.
+4. **NVIDIA Docker Toolkit**: This allows Docker to use NVIDIA GPUs. Follow the installation guide [here](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html). A quick verification command is shown below.
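
For the verification mentioned in item 4, a quick way to confirm that Docker can see the GPUs is to run `nvidia-smi` inside a CUDA base image. The image tag below is only an example; any recent `nvidia/cuda` tag works.

```bash
# Should print a table listing the available GPUs if the NVIDIA Container Toolkit is set up correctly.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```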

-## Run ServerlessLLM using Docker
+## Run ServerlessLLM using Docker Compose
 
-First, let's start a local Docker-based ray cluster to run ServerlessLLM.
+We will use Docker Compose to simplify the setup of ServerlessLLM. The `docker-compose.yml` file is located in the `examples/docker/` directory of the ServerlessLLM repository.
 
-### Step 1: Build Docker Images
+### Step 1: Clone the ServerlessLLM Repository
 
-Run the following commands to build the Docker images:
+If you haven't already, clone the ServerlessLLM repository:
 
 ```bash
-docker build . -t serverlessllm/sllm-serve
-docker build -f Dockerfile.worker . -t serverlessllm/sllm-serve-worker
+git clone https://github.com/serverless-llm/serverlessllm.git
+cd serverlessllm/examples/docker/
 ```
 
-### Step 2: Configuration
+### Step 2: Configuration
 
-Ensure that you have a directory for storing your models and set the `MODEL_FOLDER` environment variable to this directory:
+#### Set the Model Directory
+Create a directory on your host machine where models will be stored and set the `MODEL_FOLDER` environment variable to point to this directory:
 
 ```bash
-export MODEL_FOLDER=path/to/models
+export MODEL_FOLDER=/path/to/your/models
 ```
 
-Also, check if the Docker network `sllm` exists and create it if it doesn't:
+Replace `/path/to/your/models` with the actual path where you want to store the models.
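
If the directory does not exist yet, create it first; this is just the standard shell step, nothing specific to this guide.

```bash
# Creates the directory referenced by MODEL_FOLDER above, if it is missing.
mkdir -p "$MODEL_FOLDER"
```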

-```bash
-if ! docker network ls | grep -q "sllm"; then
-  echo "Docker network 'sllm' does not exist. Creating network..."
-  docker network create sllm
-else
-  echo "Docker network 'sllm' already exists."
-fi
-```
-
-### Step 3: Start the Ray Head and Worker Nodes
-
-Run the following commands to start the Ray head node and worker nodes:
-
-#### Start Ray Head Node
+### Step 3: Start the Services
 
-```bash
-docker run -d --name ray_head \
-  --runtime nvidia \
-  --network sllm \
-  -p 6379:6379 \
-  -p 8343:8343 \
-  --gpus '"device=none"' \
-  serverlessllm/sllm-serve
-```
-
-#### Start Ray Worker Nodes
+Start the ServerlessLLM services using docker compose:
 
 ```bash
-docker run -d --name ray_worker_0 \
-  --runtime nvidia \
-  --network sllm \
-  --gpus '"device=0"' \
-  --env WORKER_ID=0 \
-  --mount type=bind,source=$MODEL_FOLDER,target=/models \
-  serverlessllm/sllm-serve-worker
-
-docker run -d --name ray_worker_1 \
-  --runtime nvidia \
-  --network sllm \
-  --gpus '"device=1"' \
-  --env WORKER_ID=1 \
-  --mount type=bind,source=$MODEL_FOLDER,target=/models \
-  serverlessllm/sllm-serve-worker
+docker compose up -d --build
 ```
 
-### Step 4: Start ServerlessLLM Serve
+This command will start the Ray head node and two worker nodes defined in the `docker-compose.yml` file.
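
The authoritative `docker-compose.yml` is the one shipped in `examples/docker/`, and its contents may change between releases. As a rough sketch only (service names, image tags, and GPU settings below are assumptions inferred from the removed `docker run` commands above), it defines one head service and two GPU worker services along these lines:

```yaml
# Illustrative sketch only -- refer to examples/docker/docker-compose.yml in the repository.
services:
  sllm_head:
    image: serverlessllm/sllm-serve          # head node image (name taken from the old docker run commands)
    ports:
      - "6379:6379"                          # Ray port
      - "8343:8343"                          # ServerlessLLM API port

  sllm_worker_0:
    image: serverlessllm/sllm-serve-worker   # worker node image
    environment:
      - WORKER_ID=0
    volumes:
      - ${MODEL_FOLDER}:/models              # host model directory mounted into the container
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0"]              # pin this worker to GPU 0
              capabilities: [gpu]

  # sllm_worker_1 is analogous, with WORKER_ID=1 and device_ids: ["1"].
```

After `docker compose up -d --build` returns, `docker compose ps` should list the head node and both workers, and `docker compose logs -f` lets you follow their startup output.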

-Run the following command to start the ServerlessLLM serve:
-
-```bash
-docker exec ray_head sh -c "/opt/conda/bin/sllm-serve start"
-```
-
-### Step 5: Deploy a Model Using sllm-cli
+### Step 4: Deploy a Model Using sllm-cli
 
 Open a new terminal, activate the `sllm` environment, and set the `LLM_SERVER_URL` environment variable:
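
The concrete commands for this step fall outside the hunks shown in this diff. For orientation only, deploying a model with `sllm-cli` generally takes this shape; the server URL assumes the head node's 8343 port mapping, and the model name matches the deploy log quoted below:

```bash
# Point the CLI at the head node's API endpoint (port 8343 is an assumption from the setup above).
export LLM_SERVER_URL=http://127.0.0.1:8343/

# Register the model with the cluster; facebook/opt-1.3b matches the log output in this diff.
sllm-cli deploy --model facebook/opt-1.3b
```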

@@ -113,7 +73,7 @@ INFO 08-01 07:38:12 deploy.py:36] Deploying model facebook/opt-1.3b with default
 INFO 08-01 07:39:00 deploy.py:49] Model registered successfully.
 ```
 
-### Step 6: Query the Model
+### Step 5: Query the Model
 
 Now, you can query the model with any OpenAI API client. For example, you can use the following code to query the model:
 ```bash
@@ -134,7 +94,7 @@ Expected output:
 {"id":"chatcmpl-8b4773e9-a98b-41db-8163-018ed3dc65e2","object":"chat.completion","created":1720183759,"model":"facebook/opt-1.3b","choices":[{"index":0,"message":{"role":"assistant","content":"system: You are a helpful assistant.\nuser: What is your name?\nsystem: I am a helpful assistant.\n"},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":16,"completion_tokens":26,"total_tokens":42}}%
 ```
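
The query snippet itself is not part of the hunks shown here. As a hedged example, an OpenAI-compatible chat completion request that would produce output like the above can be sent with `curl`; the `/v1/chat/completions` path and the port are assumptions based on the OpenAI-style response and the head node's 8343 port:

```bash
curl http://127.0.0.1:8343/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "facebook/opt-1.3b",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "What is your name?"}
        ]
      }'
```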

-### Deleting a Model
+### Step 6: Clean Up
 To delete a deployed model, use the following command:
 
 ```bash
@@ -143,22 +103,7 @@ sllm-cli delete facebook/opt-1.3b
 
 This will remove the specified model from the ServerlessLLM server.
 
-You can also remove several models at once by providing multiple model names separated by spaces:
-
+To stop the ServerlessLLM services, use the following command:
 ```bash
-sllm-cli delete facebook/opt-1.3b facebook/opt-2.7b
-```
-
-### Cleanup
-
-If you need to stop and remove the containers, you can use the following commands:
-
-```bash
-docker exec ray_head sh -c "ray stop"
-docker exec ray_worker_0 sh -c "ray stop"
-docker exec ray_worker_1 sh -c "ray stop"
-
-docker stop ray_head ray_worker_0 ray_worker_1
-docker rm ray_head ray_worker_0 ray_worker_1
-docker network rm sllm
+docker compose down
 ```
