---
sidebar_position: 2
---
# Docker Quickstart Guide
This guide shows how to quickly set up a local ServerlessLLM cluster using Docker Compose. We will start a cluster with one head node and two worker nodes, then deploy and query a model using `sllm-cli`.
## Pre-requisites
Before you begin, make sure you have the following:
1. **Docker**: Installed on your system. You can download it from [here](https://docs.docker.com/get-docker/).
2. **ServerlessLLM CLI**: Installed on your system. You can install it using `pip install serverless-llm`.
3. **GPUs**: At least 2 NVIDIA GPUs are required. If you have more GPUs, you can adjust the `docker-compose.yml` file accordingly.
4. **NVIDIA Docker Toolkit**: This allows Docker to use NVIDIA GPUs. Follow the installation guide [here](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html).
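
Before moving on, you can sanity-check these prerequisites from a terminal. The CUDA image tag below is an assumption; any CUDA-enabled image will do for the GPU check:

```shell
# Check that Docker and the ServerlessLLM CLI are on the PATH.
docker --version
sllm-cli --help

# Verify that Docker can see the GPUs through the NVIDIA toolkit
# (image tag is an assumption; any CUDA-enabled image works).
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
```

If the last command prints an `nvidia-smi` table listing your GPUs, Docker is able to use them.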
## Run ServerlessLLM using Docker Compose
We will use Docker Compose to simplify the setup of ServerlessLLM. The `docker-compose.yml` file is located in the `examples/docker/` directory of the ServerlessLLM repository.
### Step 1: Clone the ServerlessLLM Repository
If you haven't already, clone the ServerlessLLM repository:
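
The clone URL below assumes the project's default GitHub location; a sketch of the sequence, ending with bringing the cluster up:

```shell
# Clone the repository (URL assumed from the project name) and enter
# the directory that holds the compose file.
git clone https://github.com/ServerlessLLM/ServerlessLLM.git
cd ServerlessLLM/examples/docker/

# Start the head node and the two worker nodes in the background.
docker compose up -d
```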
This command will start the Ray head node and two worker nodes defined in the `docker-compose.yml` file.
### Step 4: Deploy a Model Using sllm-cli
Open a new terminal, activate the `sllm` environment, and set the `LLM_SERVER_URL` environment variable:
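
A sketch of those steps, followed by the deployment command; the server address and port are assumptions, so adjust them to match your head node:

```shell
# Activate the environment that has sllm-cli installed.
conda activate sllm

# Point the CLI at the head node (address and port are assumptions;
# adjust to your deployment).
export LLM_SERVER_URL=http://127.0.0.1:8343/

# Deploy the model used throughout this guide.
sllm-cli deploy --model facebook/opt-1.3b
```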

```bash
INFO 08-01 07:38:12 deploy.py:36] Deploying model facebook/opt-1.3b with default
INFO 08-01 07:39:00 deploy.py:49] Model registered successfully.
```
### Step 5: Query the Model
Now, you can query the model with any OpenAI-compatible API client by sending a chat completion request to the server.
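
A minimal request sketch is shown below. The server URL and port are assumptions; the model name and messages match the expected output that follows:

```shell
# Send a chat completion request to the ServerlessLLM endpoint
# (URL and port are assumptions; adjust to your deployment).
curl http://127.0.0.1:8343/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "facebook/opt-1.3b",
        "messages": [
          {"role": "system", "content": "You are a helpful assistant."},
          {"role": "user", "content": "What is your name?"}
        ]
      }'
```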
Expected output:

```bash
{"id":"chatcmpl-8b4773e9-a98b-41db-8163-018ed3dc65e2","object":"chat.completion","created":1720183759,"model":"facebook/opt-1.3b","choices":[{"index":0,"message":{"role":"assistant","content":"system: You are a helpful assistant.\nuser: What is your name?\nsystem: I am a helpful assistant.\n"},"logprobs":null,"finish_reason":"stop"}],"usage":{"prompt_tokens":16,"completion_tokens":26,"total_tokens":42}}
```
### Step 6: Clean Up
To delete a deployed model, use the following command:
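
A sketch of the cleanup; the `delete` subcommand name is an assumption, so check `sllm-cli --help` for the exact spelling:

```shell
# Remove the deployed model (subcommand name assumed).
sllm-cli delete facebook/opt-1.3b

# Optionally tear down the whole cluster started by Docker Compose.
docker compose down
```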