`docs/glossary.md` — 1 addition, 61 deletions
@@ -15,67 +15,7 @@ We offer both GPU and CPU serverless options:
 ## Worker
 
-A single compute resource that processes requests. Each endpoint can have multiple workers, enabling parallel processing of multiple requests simultaneously
-
-### Total Workers
-
-Total Workers refers to the maximum number of workers available to your account. The sum of the max workers assigned across all your endpoints cannot exceed this limit. If you run out of total workers, please reach out to us by [creating a support ticket](https://contact.runpod.io/).
-
-### Max Workers
-
-Max workers set the upper limit on the number of workers your endpoint can run simultaneously.
-
-Default: 3
-
-### Active (Min) Workers
-
-“Always on” workers. Setting active workers to 1 or more ensures that a worker is always ready to respond to job requests without cold start delays.
-
-Default: 0
-
-:::note
-
-Active workers incur charges as soon as you enable them (set to >0), but they come with a discount of up to 30% off the regular price.
-
-:::
-
-### Flex Workers
-
-Flex Workers are “sometimes on” workers that help scale your endpoint during traffic surges. They are often referred to as idle workers since they spend most of their time in an idle state. Once a flex worker completes a job, it transitions to idle or sleep mode to save costs. You can adjust the idle timeout to keep them running a little longer, reducing cold start delays when new requests arrive.
-
-Default: Max Workers(3) - Active Workers(0) = 3
-
-### Extra Workers
-
-RunPod caches your worker’s Docker image on our host servers, ensuring faster scalability. If you experience a traffic spike, you can increase the max number of workers, and extra workers will be immediately added as part of the flex workers to handle the increased demand.
-
-Default: 2
-
-### Worker States
-
-#### Initializing
-
-When you create a new endpoint or release an update, RunPod needs to download and prepare the Docker image for your workers. During this process, workers remain in an initializing state until they are fully ready to handle requests.
-
-#### Idle
-
-A worker is ready to handle new requests but is not actively processing any. There is no charge while a worker is idle.
-
-#### Running
-
-A running worker is actively processing requests, and you are billed every second it runs. If a worker runs for less than a full second, it will be rounded up to the next whole second. For example, if a worker runs for 2.5 seconds, you will be billed for 3 seconds.
-
-#### Throttled
-
-Sometimes, the machine where the worker is cached may be fully occupied by other workloads. In this case, the worker will show as throttled until resources become available.
-
-#### Outdated
-
-When you update your endpoint configuration or deploy a new Docker image, existing workers are marked as outdated. These workers will continue processing their current jobs but will be gradually replaced through a rolling update, replacing 10% of max workers at a time. This ensures a smooth transition without disrupting active workloads.
-
-#### Unhealthy
-
-When your container crashes, it’s usually due to a bad Docker image, an incorrect start command, or occasionally a machine issue. When this happens, the worker is marked as unhealthy. The system will automatically retry the unhealthy worker after 1 hour, using exponential backoff for up to 7 days. Be sure to check the container logs and fix any issues causing the crash to prevent repeated failures.
+A [worker](./serverless/workers/overview.md) is a single compute resource that processes Serverless endpoint requests. Each endpoint can have multiple workers, enabling parallel processing of multiple requests simultaneously.
`docs/serverless/workers/overview.md` — 57 additions, 11 deletions
@@ -4,19 +4,65 @@ sidebar_position: 1
 description: "RunPod is a cloud-based platform for managed function execution, offering fully managed infrastructure, automatic scaling, flexible language support, and seamless integration, allowing developers to focus on code and deploy it easily."
 ---
 
-Workers run your code in the cloud.
+A worker is a single compute resource that processes Serverless endpoint requests. Each endpoint can have multiple workers, enabling parallel processing of multiple requests simultaneously.
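
As a toy illustration of that parallelism (plain Python, not the RunPod SDK; the handler and request names here are hypothetical), a fixed pool of workers can process a queue of requests concurrently:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for an endpoint's request handler.
def handle_request(request_id: int) -> str:
    return f"processed request {request_id}"

requests = list(range(6))

# Three "workers" drain the queue in parallel, like an endpoint
# configured with max workers = 3.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(handle_request, requests))

print(len(results))  # 6
```

With more workers than concurrent requests, each request gets its own worker; with fewer, requests wait in the queue until a worker frees up.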
-### Key characteristics
+## Total workers
 
-**Fully Managed Execution**: RunPod takes care of the underlying infrastructure, so your code runs whenever it's triggered, without any server setup or maintenance.
-**Automatic Scaling**: The platform scales your functions up or down based on the workload, ensuring efficient resource usage.
-**Flexible Language Support**: RunPod SDK supports various programming languages, allowing you to write functions in the language you're most comfortable with.
-**Seamless Integration**: Once your code is uploaded, RunPod provides an Endpoint, making it easy to integrate your Handler Functions into any part of your application.
+The maximum number of workers available to your account. The sum of the max workers assigned across all your endpoints cannot exceed this limit. If you run out of total workers, please reach out to us by [creating a support ticket](https://contact.runpod.io/).
-## Get started
+## Max workers
 
-To start using RunPod Workers:
+The upper limit on the number of workers that your endpoint can run simultaneously, not including any [extra workers](#extra-workers).
 
-1. **Write your function**: Code your Handler Functions in a supported language.
-2. **Deploy to RunPod**: Upload your Handler Functions to RunPod.
-3. **Integrate and Execute**: Use the provided Endpoint to integrate with your application.
+
+Default: 3
+## Active (Min) workers
+
+“Always on” workers. Setting active workers to 1 or more ensures that a worker is always ready to respond to job requests without cold start delays.
+
+Default: 0
+
+:::note
+
+Active workers incur charges as soon as you enable them (set to >0), but they come with a discount of up to 30% off the regular price.
+
+:::
+
+## Flex workers
+
+Flex workers are “sometimes on” workers that help scale your endpoint during traffic surges. They are often referred to as idle workers since they spend most of their time in an idle state. Once a flex worker completes a job, it transitions to idle or sleep mode to save costs. You can adjust the idle timeout to keep them running a little longer, reducing cold start delays when new requests arrive.
+
+Default: Max workers (3) - Active workers (0) = 3
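
That default is just the arithmetic of the two settings above: flex workers are whatever remains of max workers once active workers are pinned as always-on. A minimal sketch (the function name is ours for illustration, not a RunPod API):

```python
def flex_workers(max_workers: int, active_workers: int) -> int:
    """Flex workers = max workers minus the always-on active workers."""
    return max_workers - active_workers

# With the defaults (max workers = 3, active workers = 0),
# all three workers are flex workers.
print(flex_workers(3, 0))  # 3
```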
+## Extra workers
+
+RunPod caches your workers' Docker images on its host servers, ensuring faster scalability. If you experience a traffic spike, you can increase the max number of workers, and extra workers will be immediately added as part of the flex workers to handle the increased demand.
+
+Default: 2
+
+## Worker states
+
+### Initializing
+
+When you create a new endpoint or release an update, RunPod needs to download and prepare the Docker image for your workers. During this process, workers remain in an initializing state until they are fully ready to handle requests.
+
+### Idle
+
+A worker is ready to handle new requests but is not actively processing any. There is no charge while a worker is idle.
+
+### Running
+
+A running worker is actively processing requests, and you are billed for every second it runs. If a worker runs for less than a full second, the time is rounded up to the next whole second. For example, if a worker runs for 2.5 seconds, you are billed for 3 seconds.
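
The round-up rule above is a simple ceiling on the run time, which can be sketched as (an illustration only, not RunPod billing code):

```python
import math

def billed_seconds(run_time_seconds: float) -> int:
    """Partial seconds are rounded up to the next whole second."""
    return math.ceil(run_time_seconds)

print(billed_seconds(2.5))  # 3
print(billed_seconds(0.2))  # 1
```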
+### Throttled
+
+Sometimes, the machine where the worker is cached may be fully occupied by other workloads. In this case, the worker will show as throttled until resources become available.
+
+### Outdated
+
+When you update your endpoint configuration or deploy a new Docker image, existing workers are marked as outdated. These workers will continue processing their current jobs but will be gradually replaced through a rolling update, replacing 10% of max workers at a time. This ensures a smooth transition without disrupting active workloads.
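
As a rough sketch, the per-step batch size of such a rolling update is 10% of max workers; how fractional batches are rounded is not documented, so the round-up-to-at-least-one behavior below is our assumption:

```python
import math

def rolling_update_batch(max_workers: int) -> int:
    # 10% of max workers per step. Rounding up so at least one worker
    # is replaced per step is an assumption, not documented behavior.
    return max(1, math.ceil(0.10 * max_workers))

print(rolling_update_batch(30))  # 3
print(rolling_update_batch(5))   # 1
```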
64
+
65
+
### Unhealthy
66
+
67
+
When your container crashes, it's usually due to a bad Docker image, an incorrect start command, or occasionally a machine issue. When this happens, the worker is marked as unhealthy. Be sure to check the container logs and fix any issues causing the crash to prevent repeated failures.
68
+
The system will automatically retry the unhealthy worker after 1 hour, continuing to retry with exponential backoff for up to 7 days. If the worker successfully takes a request from the queue during a retry attempt, it will be marked as healthy again.
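
A retry schedule like this can be sketched as follows. The source states only the first delay (1 hour) and the overall window (7 days); the doubling factor is our assumption:

```python
def retry_schedule(first_delay_hours: float = 1.0,
                   max_total_hours: float = 7 * 24,
                   factor: float = 2.0) -> list[float]:
    """Retry delays starting at 1 hour and growing exponentially,
    stopping once the cumulative wait would exceed 7 days.
    The doubling factor is an assumption, not documented."""
    delays, total = [], 0.0
    delay = first_delay_hours
    while total + delay <= max_total_hours:
        delays.append(delay)
        total += delay
        delay *= factor
    return delays

print(retry_schedule())  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0]
```

With doubling, the retries cluster early (cumulative 127 hours after 7 attempts), which is the usual point of exponential backoff: give transient issues time to clear without hammering a broken worker.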