---
sidebar_position: 0
---

# Storage Aware Scheduling with Docker Compose
## Pre-requisites

We will use Docker Compose to run a ServerlessLLM cluster in this example. Therefore, please make sure you have read the [Docker Quickstart Guide](../getting_started/docker_quickstart.md) before proceeding.
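Not part of the original guide, but a quick sanity check that the Docker CLI is installed before proceeding:

```shell
# Print a hint if the `docker` CLI is missing (sanity check only)
if command -v docker >/dev/null 2>&1; then
  echo "docker CLI found"
else
  echo "docker CLI missing -- install Docker first"
fi
```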
## Usage

Start a local Docker-based Ray cluster using Docker Compose.

### Step 1: Clone the ServerlessLLM Repository

If you haven't already, clone the ServerlessLLM repository and navigate to the example directory:

```bash
git clone https://github.com/ServerlessLLM/ServerlessLLM.git
cd ServerlessLLM/examples/storage_aware_scheduling
```

### Step 2: Configuration

Create a directory on your host machine where models will be stored, and set the `MODEL_FOLDER` environment variable to point to this directory:
Replace `/path/to/your/models` with the actual path where you want to store the models.
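As a concrete sketch of this step (the directory location below is just an example; any writable path works):

```shell
# Create the host directory that will hold downloaded models
mkdir -p "$HOME/models"

# Point the cluster at it; the Docker Compose setup reads this variable
export MODEL_FOLDER="$HOME/models"
```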

### Step 3: Enable Storage Aware Scheduling in Docker Compose

The Docker Compose configuration is already located in the `examples/storage_aware_scheduling` directory. To activate storage-aware scheduling, ensure the `docker-compose.yml` file includes the necessary configuration (the `sllm_head` service should include the `--enable_storage_aware` command).

:::tip
We recommend adjusting the number of GPUs and `mem_pool_size` based on the resources available on your machine.
:::

In the `examples/storage_aware_scheduling` directory, the example configuration files (`config-opt-2.7b.json` and `config-opt-1.3b.json`) are already provided.
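For orientation, a minimal sketch of what the relevant service entry can look like. Everything here except the `--enable_storage_aware` flag and the `MODEL_FOLDER` variable is an illustrative assumption; the `docker-compose.yml` shipped in the example directory is authoritative:

```yaml
# Illustrative fragment only -- image name and layout are assumptions.
services:
  sllm_head:
    image: serverlessllm/sllm:latest      # assumed image tag
    command: ["--enable_storage_aware"]   # enables the storage-aware scheduler
    environment:
      - MODEL_FOLDER=${MODEL_FOLDER}      # host model directory from Step 2
```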

> Note: Storage aware scheduling currently only supports the "transformers" backend. Support for other backends will come soon.
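For orientation, a deployment config of this kind typically names the model and its backend. The schema below is a guess (only the `"transformers"` backend value is confirmed by the note above); consult the shipped `config-opt-2.7b.json` for the real fields:

```json
{
  "model": "facebook/opt-2.7b",
  "backend": "transformers",
  "num_gpus": 1
}
```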